Genes encoding Baeyer-Villiger monooxygenases

ABSTRACT

Genes have been isolated from a variety of bacteria encoding Baeyer-Villiger monooxygenase activity. The genes and their products are useful for the conversion of ketones to the corresponding esters. A series of motifs, common to all genes, has been identified as diagnostic for genes encoding proteins of this activity.

FIELD OF THE INVENTION

[0001] The invention relates to the field of molecular biology and microbiology. More specifically, genes have been isolated from a variety of bacteria encoding Baeyer-Villiger monooxygenase activity.

BACKGROUND OF THE INVENTION

[0002] In 1899, Baeyer and Villiger reported on a reaction of cyclic ketones with peroxymonosulfuric acid to produce lactones (Chem Ber 32:3625-3633 (1899)). Since then, the Baeyer-Villiger (BV) reaction has been broadly used in organic synthesis. BV reactions are one of only a few methods available for cleaving specific carbon-carbon bonds under mild conditions, thereby converting ketones into esters (Walsh and Chen, Angew. Chem. Int. Ed. Engl 27:333-343 (1988)).

[0003] In the last several decades, the importance of minimizing environmental impact in industrial processes has catalyzed a trend whereby alternative methods are replacing established chemical techniques. In the arena of Baeyer-Villiger (BV) oxidations, considerable interest has focused on discovery of enantioselective versions of the Baeyer-Villiger oxidation that are not based on peracids. Enzymes, which are often enantioselective, are valued alternatives as renewable, biodegradable resources.

[0004] Many microbial Baeyer-Villiger monooxygenase enzymes (BVMOs), which convert ketones to esters or the corresponding lactones (cyclic esters) (Stewart, Curr. Org. Chem. 2:195-216 (1998)), have been identified from both bacterial and fungal sources. In general, microbial BV reactions are carried out by monooxygenases (EC 1.14.13.x) which use O₂ and either NADH or NADPH as a co-reductant. One of the oxygen atoms is incorporated into the lactone product between the carbonyl carbon and the flanking carbon while the other is used to oxidize the reduced NADPH, producing H₂O (Banerjee, A. In Stereosel. Biocatal.; Patel, R. N., Ed.; Marcel Dekker: New York, 2000; Chapter 29, pp 867-876). All known BVMOs have a flavin coenzyme which acts in the oxidation reaction; the predominant coenzyme form is flavin adenine dinucleotide (FAD).

[0005] The natural physiological role of most characterized BVMOs is degradation of compounds to permit utilization of smaller hydrocarbons and/or alcohols as sources of carbon and energy. As a result of this, BVMOs display remarkably broad substrate acceptance, high enantioselectivities, and great stereoselectivity and regioselectivity (Mihovilovic et al. J. Org. Chem. 66:733-738 (2001)). Suitable substrates for the enzymes can be broadly classified as cyclic ketones, ketoterpenes, and steroids. However, few enzymes have been subjected to extensive biochemical characterization. Key studies in relation to each broad ketone substrate class are summarized below.

[0006] 1. Cyclic ketones: Activity of cyclohexanone monooxygenase upon cyclic ketone substrates in Acinetobacter sp. NCIB 9871 has been studied extensively (reviewed in Stewart, Curr. Org. Chem. 2:195-216 (1998), Table 2; Walsh and Chen, Angew. Chem. Int. Ed. Engl 27:333-343 (1988), Tables 4-5). Specificity has also been biochemically analyzed in Brevibacterium sp. HCU (Brzostowicz et al., J. Bact. 182(15):4241-4248 (2000)).

[0007] 2. Ketoterpenes: A monocyclic monoterpene ketone monooxygenase has been characterized from Rhodococcus erythropolis DCL14 (Van der Werf, J. Biochem. 347:693-701 (2000)). In addition to broad substrate specificity against ketoterpenes, the enzyme also has activity against substituted cyclohexanones.

[0008] 3. Steroids: The steroid monooxygenase of Rhodococcus rhodochrous (Morii et al. J. Biochem 126:624-631 (1999)) is well characterized, both biochemically and by sequence data.

[0009] The genes and gene products listed above are useful for specific Baeyer-Villiger reactions targeted toward cyclic ketone, ketoterpene, or steroid compounds; however, the enzymes are limited in their ability to predict other newly discovered proteins which would have similar activity.

[0010] The problem to be solved, therefore, is to provide a suite of bacterial flavoprotein Baeyer-Villiger monooxygenase enzymes that can efficiently perform oxygenation reactions on cyclic ketone and ketoterpene compounds. Identification of a suite of enzymes with this broad substrate acceptance would facilitate commercial applications of these enzymes and reduce efforts with respect to optimization of multiple enzymes for multiple reactions. Maximum efficiency is especially relevant today, when many enzymes are genetically engineered such that the enzyme is recombinantly expressed in a desirable host organism. Additionally, a collection of BVMOs with diverse amino acid sequences could be used to create a general predictive model based on amino acid sequence conservation of other BVMO enzymes. Finally, a broad class of BVMOs could also be used as basis for the in vitro evolution of novel enzymes.

[0011] Applicants have solved the stated problem by isolating several novel organisms with BVMO activity, identifying and characterizing BVMO genes, expressing these genes in microbial hosts, and demonstrating activity of the genes against a wide range of ketone substrates, including cyclic ketones and ketoterpenes. Several signature sequences have been identified, based on amino acid sequence alignments, which are characteristic of specific BVMO families and have diagnostic utility.

SUMMARY OF THE INVENTION

[0012] The invention provides an isolated nucleic acid fragment isolated from Rhodococcus selected from the group consisting of:

[0013] (a) an isolated nucleic acid fragment encoding a Baeyer-Villiger monooxygenase polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NOs:8, 10, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and 46;

[0014] (b) an isolated nucleic acid molecule encoding a Baeyer-Villiger monooxygenase polypeptide that hybridizes with (a) under the following hybridization conditions: 0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS; or

[0015] an isolated nucleic acid fragment that is complementary to (a) or (b).

[0016] Similarly the invention provides an isolated nucleic acid fragment isolated from Arthrobacter selected from the group consisting of:

[0017] (a) an isolated nucleic acid fragment encoding a Baeyer-Villiger monooxygenase polypeptide having an amino acid sequence as set forth in SEQ ID NO:12;

[0018] (b) an isolated nucleic acid molecule encoding a Baeyer-Villiger monooxygenase polypeptide that hybridizes with (a) under the following hybridization conditions: 0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS; or

[0019] an isolated nucleic acid fragment that is complementary to (a) or (b).

[0020] Additionally the invention provides an isolated nucleic acid fragment isolated from Acidovorax selected from the group consisting of:

[0021] (a) an isolated nucleic acid fragment encoding a Baeyer-Villiger monooxygenase polypeptide having an amino acid sequence as set forth in SEQ ID NO:18;

[0022] (b) an isolated nucleic acid molecule encoding a Baeyer-Villiger monooxygenase polypeptide that hybridizes with (a) under the following hybridization conditions: 0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS; or

[0023] an isolated nucleic acid fragment that is complementary to (a) or (b).

[0024] In additional embodiments the invention provides polypeptides encoded by the present sequences as well as genetic chimera of the present sequences and transformed hosts expressing the same.

[0025] In a preferred embodiment the invention provides a method for the identification of a polypeptide having monooxygenase activity comprising:

[0026] (a) obtaining the amino acid sequence of a polypeptide suspected of having monooxygenase activity; and

[0027] (b) aligning the amino acid sequence of step (a) with the amino acid sequence of a Baeyer-Villiger monooxygenase consensus sequence selected from the group consisting of SEQ ID NO:47, SEQ ID NO:48 and SEQ ID NO:49,

[0028] wherein, where at least 80% of the amino acid residues at positions p1-p74 of SEQ ID NO:47, or at least 80% of the amino acid residues at p1-p76 of SEQ ID NO:48, or at least 80% of the amino acid residues of p1-p41 of SEQ ID NO:49 are completely conserved, the polypeptide of (a) is identified as having monooxygenase activity.
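By way of illustration only, the 80% conservation test of the method above can be sketched in Python. The sketch assumes the candidate has already been aligned to the consensus, that both strings have the same length, and that non-signature columns of the consensus are marked with "."; the function names and the toy signature string are hypothetical and do not reproduce SEQ ID NOs:47-49.

```python
def signature_conservation(candidate: str, signature: str) -> float:
    """Return the fraction of signature positions conserved in the candidate.

    Both strings are assumed to be pre-aligned and of equal length.
    A "." in the signature marks a column that is not a signature position.
    """
    if len(candidate) != len(signature):
        raise ValueError("candidate and signature must be aligned to equal length")
    positions = [i for i, residue in enumerate(signature) if residue != "."]
    if not positions:
        raise ValueError("signature contains no conserved positions")
    matches = sum(
        1 for i in positions if candidate[i].upper() == signature[i].upper()
    )
    return matches / len(positions)


def has_signature(candidate: str, signature: str, threshold: float = 0.80) -> bool:
    """Apply the 'at least 80% of signature positions conserved' criterion."""
    return signature_conservation(candidate, signature) >= threshold


# Toy data only (hypothetical, six signature positions).
TOY_SIGNATURE = "G.GA.G..YW"
print(has_signature("GSGATGLHFW", TOY_SIGNATURE))  # 5 of 6 positions conserved
print(has_signature("ASGATSLHFW", TOY_SIGNATURE))  # 3 of 6 positions conserved
```

In practice the signature positions p1-pN would be taken from the consensus sequences of the Sequence Listing and the alignment produced by a program such as Clustal W.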

[0029] In an alternate embodiment the invention provides a method for identifying a gene encoding a Baeyer-Villiger monooxygenase polypeptide comprising:

[0030] (a) probing a genomic library with a nucleic acid fragment encoding a polypeptide wherein at least 80% of the amino acid residues at positions p1-p74 of SEQ ID NO:47, or at least 80% of the amino acid residues at p1-p76 of SEQ ID NO:48, or at least 80% of the amino acid residues of p1-p41 of SEQ ID NO:49 are completely conserved;

[0031] (b) identifying a DNA clone that hybridizes with a nucleic acid fragment of step (a);

[0032] (c) sequencing the genomic fragment that comprises the clone identified in step (b),

[0033] wherein the sequenced genomic fragment encodes a Baeyer-Villiger monooxygenase polypeptide.

[0034] In a preferred embodiment the invention provides a method for the biotransformation of a ketone substrate to the corresponding ester, comprising: contacting a transformed host cell under suitable growth conditions with an effective amount of ketone substrate whereby the corresponding ester is produced, said transformed host cell comprising an isolated nucleic acid fragment of any of the present nucleic acid sequences under the control of suitable regulatory sequences.

[0035] In an alternate embodiment the invention provides a method for the in vitro transformation of a ketone substrate to the corresponding ester, comprising: contacting a ketone substrate under suitable reaction conditions with an effective amount of a Baeyer-Villiger monooxygenase enzyme, the enzyme having an amino acid sequence selected from the group consisting of SEQ ID NOs:8, 10, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and 46.

[0036] Additionally the invention provides a mutated microbial gene encoding a protein having an altered biological activity produced by a method comprising the steps of:

[0037] (i) digesting a mixture of nucleotide sequences with restriction endonucleases wherein said mixture comprises:

[0038] a) a native microbial gene selected from the group consisting of SEQ ID NOs:7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, and 45;

[0039] b) a first population of nucleotide fragments which will hybridize to said native microbial sequence;

[0040] c) a second population of nucleotide fragments which will not hybridize to said native microbial sequence;

[0041] wherein a mixture of restriction fragments is produced;

[0042] (ii) denaturing said mixture of restriction fragments;

[0043] (iii) incubating the denatured mixture of restriction fragments of step (ii) with a polymerase;

[0044] (iv) repeating steps (ii) and (iii) wherein a mutated microbial gene is produced encoding a protein having an altered biological activity. Additionally the invention provides unique strains of Acidovorax sp. comprising the 16s rDNA sequence as set forth in SEQ ID NO:5, Arthrobacter sp. comprising the 16s rDNA sequence as set forth in SEQ ID NO:1, and Rhodococcus sp. comprising the 16s rDNA sequence as set forth in SEQ ID NO:6.

[0045] In another embodiment the invention provides an Acidovorax sp. comprising the 16s rDNA sequence as set forth in SEQ ID NO:5.

[0046] Additionally the invention provides an Arthrobacter sp. comprising the 16s rDNA sequence as set forth in SEQ ID NO:1. Similarly the invention provides a Rhodococcus sp. comprising the 16s rDNA sequence as set forth in SEQ ID NO:6.

[0047] Additionally the invention provides an isolated nucleic acid useful for the identification of a BV monooxygenase selected from the group consisting of SEQ ID NOs:70-113.

BRIEF DESCRIPTION OF THE DRAWINGS, AND SEQUENCE DESCRIPTIONS

[0048] FIGS. 1, 2, 3, 4, and 5 show chnB monooxygenase activity of Brevibacterium sp. HCU, Acinetobacter sp. SE19, Rhodococcus sp. phi1, Rhodococcus sp. phi2, Arthrobacter sp. BP2 and Acidovorax sp. CHX genes over-expressed in E. coli assayed against various ketone substrates.

[0049] FIG. 6 illustrates the signature sequences of the three BVMO groups based on the consensus sequences derived from the alignments of FIG. 7, FIG. 8 and FIG. 9.

[0050] FIG. 7 shows a Clustal W alignment of a family of Baeyer-Villiger monooxygenases (Family 1) and the associated signature sequence.

[0051] FIG. 8 shows a Clustal W alignment of a family of Baeyer-Villiger monooxygenases (Family 2) and the associated signature sequence.

[0052] FIG. 9 shows a Clustal W alignment of a family of BV monooxygenases (Family 3) and the associated signature sequence.

[0053] The invention can be more fully understood from the following detailed description and the accompanying sequence descriptions which form a part of this application.

[0054] The following sequences conform with 37 C.F.R. 1.821-1.825 (“Requirements for Patent Applications Containing Nucleotide Sequences and/or Amino Acid Sequence Disclosures - the Sequence Rules”) and are consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (1998) and the sequence listing requirements of the EPO and PCT (Rules 5.2 and 49.5(a-bis), and Section 208 and Annex C of the Administrative Instructions). The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. §1.822.

[0055] SEQ ID NOs:1-49 are full length genes or proteins as identified in Table 1.

TABLE 1
Summary of Gene and Protein SEQ ID Numbers

Gene Name                         Organism                       Gene SEQ ID No  Protein SEQ ID No
16s rDNA sequence                 Arthrobacter sp. BP2           1               -
16s rDNA sequence                 Rhodococcus sp. phi1           2               -
16s rDNA sequence                 Rhodococcus sp. phi2           3               -
16s rDNA sequence                 Brevibacterium sp. HCU         4               -
16s rDNA sequence                 Acidovorax sp. CHX             5               -
16s rDNA sequence                 Rhodococcus erythropolis AN12  6               -
chnB Monooxygenase phi1           Rhodococcus sp. phi1           7               8
chnB Monooxygenase phi2           Rhodococcus sp. phi2           9               10
chnB Monooxygenase BP2            Arthrobacter sp. BP2           11              12
chnB1 Monooxygenase HCU #1        Brevibacterium sp. HCU         13              14
chnB2 Monooxygenase HCU #2        Brevibacterium sp. HCU         15              16
chnB Monooxygenase CHX            Acidovorax sp. CHX             17              18
chnB Monooxygenase SE19           Acinetobacter sp. SE19         19              20
ORF 8 chnB Monooxygenase (1413)   Rhodococcus erythropolis AN12  21              22
ORF 9 chnB Monooxygenase (1985)   Rhodococcus erythropolis AN12  23              24
ORF 10 chnB Monooxygenase (1273)  Rhodococcus erythropolis AN12  25              26
ORF 11 chnB Monooxygenase (2034)  Rhodococcus erythropolis AN12  27              28
ORF 12 chnB Monooxygenase (1870)  Rhodococcus erythropolis AN12  29              30
ORF 13 chnB Monooxygenase (1861)  Rhodococcus erythropolis AN12  31              32
ORF 14 chnB Monooxygenase (2005)  Rhodococcus erythropolis AN12  33              34
ORF 15 chnB Monooxygenase (2035)  Rhodococcus erythropolis AN12  35              36
ORF 16 chnB Monooxygenase (2022)  Rhodococcus erythropolis AN12  37              38
ORF 17 chnB Monooxygenase (1976)  Rhodococcus erythropolis AN12  39              40
ORF 18 chnB Monooxygenase (1294)  Rhodococcus erythropolis AN12  41              42
ORF 19 chnB Monooxygenase (2082)  Rhodococcus erythropolis AN12  43              44
ORF 20 chnB Monooxygenase (2093)  Rhodococcus erythropolis AN12  45              46
Signature Sequence #1             Consensus Sequence             -               47
Signature Sequence #2             Consensus Sequence             -               48
Signature Sequence #3             Consensus Sequence             -               49

[0056] SEQ ID NOs:50-62 are primers used for 16s rDNA sequencing.

[0057] SEQ ID NO:63 describes a primer used for RT-PCR and out-PCR.

[0058] SEQ ID NOs:64 and 65 are primers used for sequencing of inserts within pCR2.1.

[0059] SEQ ID NOs:66 and 67 are primers used to amplify monooxygenase genes from Acinetobacter sp. SE19.

[0060] SEQ ID NOs:68-107 are primers used for amplification of full length Baeyer-Villiger monooxygenases.

[0061] SEQ ID NOs:108-113 are primers used to screen cosmid libraries.

DETAILED DESCRIPTION OF THE INVENTION

[0062] The invention provides nucleic acid and amino acid sequences defining a group of Baeyer-Villiger monooxygenase enzymes. These enzymes have been found to have the ability to use a wide variety of ketone substrates that include two general classes of compounds, cyclic ketones and ketoterpenes. These enzymes are characterized by function as well as a series of diagnostic signature sequences. The enzymes may be expressed recombinantly for the conversion of ketone substrates to the corresponding lactones or esters.

[0063] In this disclosure, a number of terms and abbreviations are used. The following definitions are provided.

[0064] “Open reading frame” is abbreviated ORF.

[0065] “Polymerase chain reaction” is abbreviated PCR.

[0066] “Gas Chromatography Mass spectrometry” is abbreviated GC-MS.

[0067] “Baeyer-Villiger” is abbreviated BV.

[0068] “Baeyer-Villiger monooxygenase” is abbreviated BVMO.

[0069]

[0070] The term “Baeyer-Villiger monooxygenase” refers to a bacterial enzyme that has the ability to oxidize a ketone substrate to the corresponding lactone or ester.

[0071] The term “ketone substrate” includes a substrate for a Baeyer-Villiger monooxygenase that comprises a class of compounds which include cyclic ketones and ketoterpenes. Ketone substrates of the invention are defined by the general formula:

[0072] wherein R and R₁ are independently selected from substituted or unsubstituted phenyl, substituted or unsubstituted alkyl, substituted or unsubstituted alkenyl, or substituted or unsubstituted alkylidene.

[0073] The term “alkyl” will mean a univalent group derived from alkanes by removal of a hydrogen atom from any carbon atom: CₙH₂ₙ₊₁-. The groups derived by removal of a hydrogen atom from a terminal carbon atom of unbranched alkanes form a subclass of normal alkyl (n-alkyl) groups: H[CH₂]ₙ-. The groups RCH₂-, R₂CH- (R not equal to H), and R₃C- (R not equal to H) are primary, secondary and tertiary alkyl groups respectively.

[0074] The term “alkenyl” will mean an acyclic branched or unbranched hydrocarbon having one carbon-carbon double bond and the general formula CₙH₂ₙ. Acyclic branched or unbranched hydrocarbons having more than one double bond are alkadienes, alkatrienes, etc.

[0075] The term “alkylidene” will mean the divalent groups formed from alkanes by removal of two hydrogen atoms from the same carbon atom, the free valences of which are part of a double bond (e.g. (CH₃)₂C, also known as propan-2-ylidene).

[0076] As used herein, an “isolated nucleic acid molecule” is a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.

[0077] A nucleic acid molecule is “hybridizable” to another nucleic acid molecule, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the nucleic acid molecule can anneal to the other nucleic acid molecule under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 therein (entirely incorporated herein by reference). The conditions of temperature and ionic strength determine the “stringency” of the hybridization. Stringency conditions can be adjusted to screen for moderately similar fragments, such as homologous sequences from distantly related organisms, to highly similar fragments, such as genes that duplicate functional enzymes from closely related organisms. Typical stringent hybridization conditions are, for example, hybridization at 0.1×SSC, 0.1% SDS, 65° C. with a wash with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS. Generally post-hybridization washes determine stringency conditions. One set of preferred conditions uses a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2×SSC, 0.5% SDS at 45° C. for 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at 50° C. for 30 min. A more preferred set of stringent conditions uses higher temperatures in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS is increased to 60° C. Another preferred set of highly stringent conditions uses two final washes in 0.1×SSC, 0.1% SDS at 65° C. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible.

The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. For hybrids of greater than 100 nucleotides in length, equations for calculating Tm have been derived (see Sambrook et al., supra, 9.50-9.51). For hybridizations with shorter nucleic acids, i.e., oligonucleotides, the position of mismatches becomes more important, and the length of the oligonucleotide determines its specificity (see Sambrook et al., supra, 11.7-11.8). In one embodiment the length for a hybridizable nucleic acid is at least about 10 nucleotides. Preferably a minimum length for a hybridizable nucleic acid is at least about 15 nucleotides; more preferably at least about 20 nucleotides; and most preferably the length is at least 30 nucleotides. Furthermore, the skilled artisan will recognize that the temperature and wash solution salt concentration may be adjusted as necessary according to factors such as length of the probe.
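For illustration, the dependence of Tm on salt concentration, base composition and hybrid length can be sketched with one widely quoted empirical approximation for DNA:DNA hybrids longer than about 100 nucleotides. It is an assumption that this form corresponds to the equations referenced in Sambrook et al.; the function name and parameter names are hypothetical, and the result is an estimate only.

```python
import math


def estimate_tm(na_molar: float, gc_percent: float, length_nt: int,
                formamide_percent: float = 0.0) -> float:
    """Estimate Tm (degrees C) of a long DNA:DNA hybrid.

    Uses the empirical approximation
        Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.63*(%formamide) - 600/L
    where [Na+] is the molar sodium ion concentration and L is the hybrid
    length in nucleotides. Intended for hybrids longer than about 100 nt.
    """
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    if length_nt <= 0:
        raise ValueError("hybrid length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_nt)


# Lowering the salt concentration lowers the estimated Tm, which is why
# washes at lower SSC concentrations are more stringent.
for na in (1.0, 0.1, 0.02):
    print(na, round(estimate_tm(na, gc_percent=50.0, length_nt=600), 1))
```

The sketch takes the sodium ion concentration directly rather than an SSC dilution, because the conversion from SSC strength to sodium molarity depends on the buffer recipe used.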

[0078] The term “complementary” is used to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenosine is complementary to thymine and cytosine is complementary to guanine. Accordingly, the instant invention also includes isolated nucleic acid fragments that are complementary to the complete sequences as reported in the accompanying Sequence Listing as well as those substantially similar nucleic acid sequences.
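As a minimal sketch of the base-pairing relationship described above, the complementary strand of a DNA sequence can be derived as follows. The function name is hypothetical, and only the four unambiguous DNA bases are handled.

```python
# Mapping of each DNA base to its complement (A pairs with T, C pairs with G).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    unexpected = set(dna) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unsupported characters: {sorted(unexpected)}")
    return dna.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```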

[0079] The term “percent identity”, as known in the art, is a relationship between two or more polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the sequences. In the art, “identity” also means the degree of sequence relatedness between polypeptide or polynucleotide sequences, as the case may be, as determined by the match between strings of such sequences. “Identity” and “similarity” can be readily calculated by known methods, including but not limited to those described in: Computational Molecular Biology (Lesk, A. M., ed.) Oxford University Press, New York (1988); Biocomputing: Informatics and Genome Projects (Smith, D. W., ed.) Academic Press, New York (1993); Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H. G., eds.) Humana Press, New Jersey (1994); Sequence Analysis in Molecular Biology (von Heijne, G., ed.) Academic Press (1987); and Sequence Analysis Primer (Gribskov, M. and Devereux, J., eds.) Stockton Press, New York (1991). Preferred methods to determine identity are designed to give the best match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. Sequence alignments and percent identity calculations may be performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences was performed using the Clustal method of alignment (Higgins and Sharp (1989) CABIOS. 5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Default parameters for pairwise alignments using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and DIAGONALS SAVED=5.
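By way of example only, percent identity for two sequences that have already been aligned (gaps written as "-") may be computed as sketched below. The choice of denominator is an assumption: here it is the number of alignment columns in which at least one sequence has a residue, whereas programs such as Megalign may use a different convention. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a column counts
    as a match only when both sequences carry the same residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy alignment (hypothetical sequences): 5 identical columns out of 7.
print(round(percent_identity("MKT-LLV", "MKTALIV"), 1))
```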

[0080] Suitable nucleic acid fragments (isolated polynucleotides of the present invention) encode polypeptides that are at least about 70% identical, preferably at least about 80% identical to the amino acid sequences reported herein. Preferred nucleic acid fragments encode amino acid sequences that are about 85% identical to the amino acid sequences reported herein. More preferred nucleic acid fragments encode amino acid sequences that are at least about 90% identical to the amino acid sequences reported herein. Most preferred are nucleic acid fragments that encode amino acid sequences that are at least about 95% identical to the amino acid sequences reported herein. Suitable nucleic acid fragments not only have the above homologies but typically encode a polypeptide having at least 50 amino acids, preferably at least 100 amino acids, more preferably at least 150 amino acids, still more preferably at least 200 amino acids, and most preferably at least 250 amino acids.

[0081] “Codon degeneracy” refers to the nature in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment that encodes all or a substantial portion of the amino acid sequence of the instant microbial polypeptides as set forth in SEQ ID NOs:8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and 46. The skilled artisan is well aware of the “codon-bias” exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.
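Codon degeneracy and codon usage can be illustrated with a short sketch. It assumes the standard genetic code and a coding sequence whose length is a multiple of three; the function names and the example sequences are hypothetical and are not taken from the Sequence Listing.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" denotes a stop codon.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into codons (RNA 'U' is treated as 'T')."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds: str) -> str:
    """Translate a coding sequence; raises KeyError on non-ACGT codons."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))


def codon_usage(cds: str) -> Counter:
    """Count how often each codon occurs in a coding sequence."""
    return Counter(split_codons(cds))


# Two different nucleotide sequences encoding the same tripeptide.
print(translate("ATGGCTAAA"), translate("ATGGCCAAG"))
print(codon_usage("GCTGCTGCC"))
```

A codon usage table compiled in this way from genes of the intended host could serve as the reference when choosing among synonymous codons for a synthetic gene.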

[0082] “Synthetic genes” can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These building blocks are ligated and annealed to form gene segments which are then enzymatically assembled to construct the entire gene. “Chemically synthesized”, as related to a sequence of DNA, means that the component nucleotides were assembled in vitro. Manual chemical synthesis of DNA may be accomplished using well established procedures, or automated chemical synthesis can be performed using one of a number of commercially available machines. Accordingly, the genes can be tailored for optimal gene expression based on optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell where sequence information is available.

[0083] “Gene” refers to a nucleic acid fragment that expresses a specific protein, including regulatory sequences preceding (5′ non-coding sequences) and following (3′ non-coding sequences) the coding sequence. “Native gene” refers to a gene as found in nature with its own regulatory sequences. “Chimeric gene” refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. “Endogenous gene” refers to a native gene in its natural location in the genome of an organism. A “foreign” gene refers to a gene not normally found in the host organism, but that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, or chimeric genes. A “transgene” is a gene that has been introduced into the genome by a transformation procedure.

[0084] “Coding sequence” refers to a DNA sequence that codes for a specific amino acid sequence. “Suitable regulatory sequences” refer to nucleotide sequences located upstream (5′ non-coding sequences), within, or downstream (3′ non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, translation leader sequences, introns, polyadenylation recognition sequences, RNA processing sites, effector binding sites and stem-loop structures.

[0085] “Promoter” refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3′ to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters which cause a gene to be expressed in most cell types at most times are commonly referred to as “constitutive promoters”. It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

[0086] The “3′ non-coding sequences” refer to DNA sequences located downstream of a coding sequence and include polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3′ end of the mRNA precursor.

[0087] “RNA transcript” refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript, or it may be an RNA sequence derived from post-transcriptional processing of the primary transcript, in which case it is referred to as the mature RNA. “Messenger RNA (mRNA)” refers to the RNA that is without introns and that can be translated into protein by the cell. “cDNA” refers to a double-stranded DNA that is complementary to and derived from mRNA. “Sense” RNA refers to RNA transcript that includes the mRNA and so can be translated into protein by the cell. “Antisense RNA” refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat. No. 5,107,065; WO 9928508). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5′ non-coding sequence, 3′ non-coding sequence, or the coding sequence. “Functional RNA” refers to antisense RNA, ribozyme RNA, or other RNA that is not translated yet has an effect on cellular processes.

[0088] The term “operably linked” refers to the association of nucleicacid sequences on a single nucleic acid fragment so that the function ofone is affected by the other. For example, a promoter is operably linkedwith a coding sequence when it is capable of affecting the expression ofthat coding sequence (i.e., that the coding sequence is under thetranscriptional control of the promoter). Coding sequences can beoperably linked to regulatory sequences in sense or antisenseorientation.

[0089] The term “expression”, as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of the invention. Expression may also refer to translation of mRNA into a polypeptide. “Transformation” refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as “transgenic” or “recombinant” or “transformed” organisms. The terms “plasmid”, “vector” and “cassette” refer to an extrachromosomal element often carrying genes which are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA molecules. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing a promoter fragment and DNA sequence for a selected gene product along with appropriate 3′ untranslated sequence into a cell. “Transformation cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that facilitate transformation of a particular host cell. “Expression cassette” refers to a specific vector containing a foreign gene and having elements in addition to the foreign gene that allow for enhanced expression of that gene in a foreign host.

[0090] The term “sequence analysis software” refers to any computer algorithm or software program that is useful for the analysis of nucleotide or amino acid sequences. “Sequence analysis software” may be commercially available or independently developed. Typical sequence analysis software will include but is not limited to the GCG suite of programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, WI), BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol. Biol. 215:403-410 (1990)), DNASTAR (DNASTAR, Inc., 1228 S. Park St., Madison, Wis. 53715 USA), and the FASTA program incorporating the Smith-Waterman algorithm (W. R. Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Publisher: Plenum, New York, N.Y.). Within the context of this application it will be understood that where sequence analysis software is used for analysis, the results of the analysis will be based on the “default values” of the program referenced, unless otherwise specified. As used herein “default values” will mean any set of values or parameters which originally load with the software when first initialized.

[0091] The term “signature sequence” means a set of amino acids conserved at specific positions along an aligned sequence of evolutionarily related proteins. While amino acids at other positions can vary between homologous proteins, amino acids which are highly conserved at specific positions indicate amino acids which are essential in the structure, the stability, or the activity of a protein. Because they are identified by their high degree of conservation in aligned sequences of a family of protein homologues, they can be used as identifiers, or “signatures”, to determine if a protein with a newly determined sequence belongs to a previously identified protein family. Signature sequences of the present invention are specifically described in FIG. 6, which shows the signature sequences comprised of p1-p74 of SEQ ID NO:47, p1-p76 of SEQ ID NO:48 and p1-p41 of SEQ ID NO:49.
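As an illustration of the definition above, the following minimal Python sketch shows one way completely conserved positions could be extracted from a set of pre-aligned protein sequences. The function name `conserved_positions`, the gap character, and the toy alignment are hypothetical and are not taken from FIG. 6 or from SEQ ID NOs:47-49; the sketch assumes the alignment has already been produced by sequence analysis software.

```python
from typing import List, Tuple


def conserved_positions(aligned: List[str], gap: str = "-") -> List[Tuple[int, str]]:
    """Return (column index, residue) for every alignment column in which
    all sequences carry the same non-gap residue."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    signature = []
    for col in range(length):
        residues = {seq[col] for seq in aligned}
        # A column is part of the signature only if exactly one residue
        # occurs in it and that residue is not a gap.
        if len(residues) == 1:
            residue = next(iter(residues))
            if residue != gap:
                signature.append((col, residue))
    return signature


# Hypothetical toy alignment, for illustration only.
toy_alignment = [
    "MGA-KWT",
    "MSAQKWS",
    "MGAEKWT",
]
toy_signature = conserved_positions(toy_alignment)
```

In this toy case the conserved columns are 0, 2, 4 and 5; the positions of such a list would play the role of p1, p2, and so on.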

[0092] Standard recombinant DNA and molecular cloning techniques used here are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989) (hereinafter “Maniatis”); and by Silhavy, T. J., Berman, M. L. and Enquist, L. W., Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1984); and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience (1987).

[0093] Isolation of Microorganisms Having Baeyer-Villiger Monooxygenase Activity

[0094] Microorganisms having Baeyer-Villiger monooxygenase activity may be isolated from a variety of sources. Suitable sources include industrial waste streams, soil from contaminated industrial sites and waste stream treatment facilities. The Baeyer-Villiger monooxygenase-containing microorganisms of the instant invention were isolated from activated sludge from waste water treatment plants.

[0095] Samples suspected of containing a microorganism having Baeyer-Villiger monooxygenase activity may be enriched by incubation in a suitable growth medium in combination with at least one ketone substrate. Suitable ketone substrates for use in the instant invention include cyclic ketones and ketoterpenes having the general formula:

[0096] wherein R and R₁ are independently selected from substituted or unsubstituted phenyl, substituted or unsubstituted alkyl, substituted or unsubstituted alkenyl, or substituted or unsubstituted alkylidene. These compounds may be synthetic or natural secondary metabolites.

[0097] Particularly useful ketone substrates include, but are not limited to, Norcamphor, Cyclobutanone, Cyclopentanone, 2-methyl-cyclopentanone, Cyclohexanone, 2-methyl-cyclohexanone, Cyclohex-2-ene-1-one, 1,2-cyclohexanedione, 1,3-cyclohexanedione, 1,4-cyclohexanedione, Cycloheptanone, Cyclooctanone, Cyclodecanone, Cycloundecanone, Cyclododecanone, Cyclotridecanone, Cyclopentadecanone, 2-tridecanone, dihexyl ketone, 2-phenyl-cyclohexanone, Oxindole, Levoglucosenone, dimethyl sulfoxide, dimethyl-2-piperidone, Phenylboronic acid, and beta-ionone. Growth media and techniques needed in the enrichment and screening of microorganisms are well known in the art, and examples may be found in Manual of Methods for General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, eds.), American Society for Microbiology, Washington, D.C. (1994); or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland, Mass. (1989).

[0098] Characterization of the Baeyer-Villiger Monooxygenase Containing Microorganisms:

[0099] The sequence of the small subunit ribosomal RNA or DNA (16S rDNA) is frequently used for taxonomic identification of novel bacteria. More than 7,000 bacterial 16S rDNA sequences are currently available. Highly conserved regions of the 16S rDNA provide priming sites for broad-range polymerase chain reaction (PCR) (or RT-PCR) and obviate the need for specific information about a targeted microorganism before this procedure. This permits identification of a previously uncharacterized bacterium by broad-range bacterial 16S rDNA amplification, sequencing, and phylogenetic analysis.

[0100] This invention describes the isolation and identification of 7 different bacteria based on their taxonomic identification following amplification of the 16S rDNA using primers corresponding to conserved regions of the 16S rDNA molecule (Amann, R. I. et al. Microbiol. Rev. 59(1):143-69 (1995); Kane, M. D. et al. Appl. Environ. Microbiol. 59:682-686 (1993)), followed by sequencing and BLAST analysis (Basic Local Alignment Search Tool; Altschul, S. F., et al., J. Mol. Biol. 215:403-410 (1990); see also www.ncbi.nlm.nih.gov/BLAST/). Bacterial strains were identified as highly homologous to bacteria of the genera Brevibacterium, Arthrobacter, Acinetobacter, Acidovorax, and Rhodococcus.

[0101] Comparison of the 16S rRNA nucleotide base sequence from strain AN12 to public databases reveals that the most similar known sequences (98% homologous) are the 16S rRNA gene sequences of bacteria belonging to the genus Rhodococcus.

[0102] Comparison of the 16S rRNA nucleotide base sequence from strain CHX to public databases reveals that the most similar known sequences (97% homologous) are the 16S rRNA gene sequences of bacteria of the genus Acidovorax.

[0103] Comparison of the 16S rRNA nucleotide base sequence from strain BP2 to public databases reveals that the most similar known sequences (99% homologous) are the 16S rRNA gene sequences of bacteria of the genus Arthrobacter. Comparison of the 16S rRNA nucleotide base sequence from strain SE19 to public databases reveals that the most similar known sequences (99% homologous) are the 16S rRNA gene sequences of bacteria of the genus Acinetobacter.

[0104] Comparison of the 16S rRNA nucleotide base sequence from strains phi1 and phi2 to public databases reveals that the most similar known sequences (99% homologous) are the 16S rRNA gene sequences of bacteria belonging to the genus Rhodococcus.

[0105] Identification of Baeyer-Villiger Monooxygenase Homologs

[0106] The present invention provides examples of Baeyer-Villiger monooxygenase genes and gene products having the ability to convert suitable ketone substrates comprising cyclic ketones and ketoterpenes to the corresponding lactone or ester. For example, genes encoding BVMOs have been isolated from Arthrobacter (SEQ ID NO:11), Brevibacterium (SEQ ID NOs:13 and 15), Acidovorax (SEQ ID NO:17), Acinetobacter (SEQ ID NO:19), and Rhodococcus (SEQ ID NOs:7, 9, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, and 45).

[0107] Comparison of the Arthrobacter sp. BP2 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 57% identical to the amino acid sequence reported herein over a length of 532 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0108] Comparison of the Acidovorax sp. CHX chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 57% identical to the amino acid sequence reported herein over a length of 538 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0109] Comparison of the Rhodococcus sp. phi1 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 55% identical to the amino acid sequence reported herein over a length of 542 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra).

[0110] Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0111] Comparison of the Rhodococcus sp. phi2 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 53% identical to the amino acid sequence reported herein over a length of 541 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra).

[0112] Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0113] Comparison of the Rhodococcus erythropolis AN12 ORF8 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 37% identical to the amino acid sequence reported herein over a length of 439 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0114] Comparison of the Rhodococcus erythropolis AN12 ORF9 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 44% identical to the amino acid sequence reported herein over a length of 518 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0115] Comparison of the Rhodococcus erythropolis AN12 ORF10 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 64% identical to the amino acid sequence reported herein over a length of 541 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0116] Comparison of the Rhodococcus erythropolis AN12 ORF11 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 65% identical to the amino acid sequence reported herein over a length of 462 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0117] Comparison of the Rhodococcus erythropolis AN12 ORF12 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 45% identical to the amino acid sequence reported herein over a length of 523 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0118] Comparison of the Rhodococcus erythropolis AN12 ORF13 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 55% identical to the amino acid sequence reported herein over a length of 493 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0119] Comparison of the Rhodococcus erythropolis AN12 ORF14 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 51% identical to the amino acid sequence reported herein over a length of 539 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0120] Comparison of the Rhodococcus erythropolis AN12 ORF15 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 39% identical to the amino acid sequence reported herein over a length of 649 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0121] Comparison of the Rhodococcus erythropolis AN12 ORF16 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 43% identical to the amino acid sequence reported herein over a length of 494 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0122] Comparison of the Rhodococcus erythropolis AN12 ORF17 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 53% identical to the amino acid sequence reported herein over a length of 499 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0123] Comparison of the Rhodococcus erythropolis AN12 ORF18 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 44% identical to the amino acid sequence reported herein over a length of 493 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0124] Comparison of the Rhodococcus erythropolis AN12 ORF19 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 54% identical to the amino acid sequence reported herein over a length of 541 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.

[0125] Comparison of the Rhodococcus erythropolis AN12 ORF20 chnB nucleotide base and deduced amino acid sequences to public databases reveals that the most similar known sequences are about 42% identical to the amino acid sequence reported herein over a length of 545 amino acids using a Smith-Waterman alignment algorithm (W. R. Pearson, supra). Preferred amino acid fragments are at least about 70%-80% identical, and more preferred amino acid fragments are at least about 80%-90% identical, to the sequences herein. Most preferred are amino acid fragments that are at least 95% identical to the amino acid fragments reported herein. Similarly, preferred chnB-encoding nucleic acid sequences corresponding to the instant ORFs are those encoding active proteins and which are at least 80% identical to the nucleic acid sequences reported herein. More preferred chnB nucleic acid fragments are at least 90% identical to the sequences herein. Most preferred are chnB nucleic acid fragments that are at least 95% identical to the nucleic acid fragments reported herein.
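The identity thresholds recited in the preceding paragraphs can be illustrated with a short Python sketch. It assumes two sequences that have already been aligned (for example, by a Smith-Waterman implementation) and computes percent identity over the aligned columns. The function names, the choice to count gap-versus-residue columns as mismatches, and the reading of "70%-80%" and "80%-90%" as lower bounds of 70% and 80% are assumptions made for illustration; alignment programs differ in how they define the denominator, and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def preference_tier(identity: float) -> str:
    """Map a percent identity onto the amino acid tiers, using assumed lower bounds."""
    if identity >= 95.0:
        return "most preferred"
    if identity >= 80.0:
        return "more preferred"
    if identity >= 70.0:
        return "preferred"
    return "below preferred range"


# Hypothetical aligned fragments, for illustration only.
example_identity = percent_identity("MKT-AW", "MKSQAW")
example_tier = preference_tier(example_identity)
```

Here four of six compared columns match, so the toy pair falls below the preferred range.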

[0126] In addition to the identification of the above-mentioned sequences and the biochemical characterization of the activity of the gene product, Applicants have made the discovery that many of these monooxygenase proteins share diagnostic signature sequences which may be used for the identification of other proteins having similar activity. For example, the present monooxygenases may be grouped into three general families based on sequence alignment. One group, referred to herein as BV Family 1, is comprised of the monooxygenase sequences shown in FIG. 7, which generate the consensus sequence set forth in SEQ ID NO:47. As will be seen in FIG. 7, there is a group of completely conserved amino acids in 74 positions across all of the sequences of FIG. 7. These positions are further delineated in FIG. 6, and indicated as p1-p74.

[0127] Similarly, BV Family 2 is comprised of the monooxygenase sequences shown in FIG. 8, which generate the consensus sequence set forth in SEQ ID NO:48. The signature sequence of BV Family 2 monooxygenases is shown in FIG. 6 as positions p1-p76. BV Family 3 monooxygenases are shown in FIG. 9 and generate the consensus sequence set forth in SEQ ID NO:49, having the signature sequence shown in FIG. 6 as positions p1-p41.

[0128] Although there is variation among the sequences of the various families, all of the individual members of these families have been shown to possess monooxygenase activity. Thus, it is contemplated that where a polypeptide possesses the signature sequences as defined in FIGS. 6-9, it will have monooxygenase activity. It is thus within the scope of the present invention to provide a method for identifying a gene encoding a Baeyer-Villiger monooxygenase polypeptide comprising:

[0129] (a) probing a genomic library with a nucleic acid fragment encoding a polypeptide wherein at least 80% of the amino acid residues at positions p1-p74 of SEQ ID NO:47, or at least 80% of the amino acid residues at p1-p76 of SEQ ID NO:48, or at least 80% of the amino acid residues of p1-p41 of SEQ ID NO:49 are completely conserved;

[0130] (b) identifying a DNA clone that hybridizes with a nucleic acid fragment of step (a);

[0131] (c) sequencing the genomic fragment that comprises the clone identified in step (b),

[0132] wherein the sequenced genomic fragment encodes a Baeyer-Villiger monooxygenase polypeptide.

[0133] In a preferred embodiment the invention provides the above method wherein 100% of the amino acid residues at positions p1-p74 of SEQ ID NO:47, or 100% of the amino acid residues at p1-p76 of SEQ ID NO:48, or 100% of the amino acid residues of p1-p41 of SEQ ID NO:49 are completely conserved.
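The 80% and 100% conservation criteria recited above can be sketched in Python as a simple fraction of signature positions matched by a candidate polypeptide. This is an illustrative sketch only: the function names, the dictionary representation of a signature (alignment column mapped to conserved residue), and the toy data are hypothetical and do not reproduce the positions of FIG. 6. The candidate is assumed to have been aligned to the family consensus beforehand.

```python
from typing import Dict


def signature_match_fraction(candidate_aligned: str, signature: Dict[int, str]) -> float:
    """Fraction of signature positions at which the aligned candidate carries
    the conserved residue."""
    if not signature:
        raise ValueError("signature must contain at least one position")
    hits = sum(
        1
        for position, residue in signature.items()
        if position < len(candidate_aligned) and candidate_aligned[position] == residue
    )
    return hits / len(signature)


def meets_signature(candidate_aligned: str, signature: Dict[int, str],
                    threshold: float = 0.80) -> bool:
    """True if at least `threshold` of the signature positions are conserved.

    threshold=0.80 mirrors the general criterion; threshold=1.0 mirrors the
    preferred embodiment.
    """
    return signature_match_fraction(candidate_aligned, signature) >= threshold


# Hypothetical five-position signature and candidates, for illustration only.
toy_signature = {0: "M", 2: "A", 4: "K", 5: "W", 7: "G"}
full_match = meets_signature("MSAQKWTG", toy_signature, threshold=1.0)
four_of_five = meets_signature("MSAQRWTG", toy_signature)
three_of_five = meets_signature("MSVQRWTG", toy_signature)
```

With this toy signature, a candidate matching four of five positions satisfies the 80% criterion but not the 100% criterion.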

[0134] It will be appreciated that other Baeyer-Villiger monooxygenase genes having similar substrate specificity may be identified and isolated on the basis of sequence-dependent protocols or according to alignment against the signature sequences disclosed herein.

[0135] Isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to, methods of nucleic acid hybridization, and methods of DNA and RNA amplification as exemplified by various uses of nucleic acid amplification technologies (e.g., polymerase chain reaction (PCR), Mullis et al., U.S. Pat. No. 4,683,202; ligase chain reaction (LCR), Tabor, S. et al., Proc. Natl. Acad. Sci. USA 82:1074 (1985); or strand displacement amplification (SDA), Walker et al., Proc. Natl. Acad. Sci. U.S.A. 89:392 (1992)).

[0136] For example, genes encoding similar proteins or polypeptides to the present Baeyer-Villiger monooxygenases could be isolated directly by using all or a portion of the nucleic acid fragments set forth in SEQ ID NOs:7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, and 45 as DNA hybridization probes to screen libraries from any desired bacteria using methodology well known to those skilled in the art. Specific oligonucleotide probes based upon the instant nucleic acid sequences can be designed and synthesized by methods known in the art (Maniatis, supra). Moreover, the entire sequences can be used directly to synthesize DNA probes by methods known to the skilled artisan such as random primer DNA labeling, nick translation, or end-labeling techniques, or RNA probes using available in vitro transcription systems. In addition, specific primers can be designed and used to amplify a part or the full length of the instant sequences. The resulting amplification products can be labeled directly during amplification reactions or labeled after amplification reactions, and used as probes to isolate full-length DNA fragments under conditions of appropriate stringency.

[0137] Typically, in PCR-type primer-directed amplification techniques, the primers have different sequences and are not complementary to each other. Depending on the desired test conditions, the sequences of the primers should be designed to provide for both efficient and faithful replication of the target nucleic acid. Methods of PCR primer design are common and well known in the art (Thein and Wallace, “The use of oligonucleotide as specific hybridization probes in the Diagnosis of Genetic Disorders”, in Human Genetic Diseases: A Practical Approach, K. E. Davis Ed., (1986) pp. 33-50, IRL Press, Herndon, Va.; Rychlik, W. (1993) In White, B. A. (ed.), Methods in Molecular Biology, Vol. 15, pages 31-39, PCR Protocols: Current Methods and Applications. Humana Press, Inc., Totowa, N.J.).
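By way of illustration only, the two design constraints stated above (primers that differ in sequence and are not complementary to each other) can be checked computationally. The following Python sketch is an assumption of this edit, not part of the disclosed method: the function names are hypothetical, the melting temperature uses the simple rule-of-thumb 2° C. per A/T and 4° C. per G/C that is generally applied only to short oligonucleotides, and the example sequences are invented.

```python
# Minimal primer sanity checks; all names and example sequences are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def rule_of_thumb_tm(seq: str) -> int:
    """Approximate Tm in deg C as 2*(A+T) + 4*(G+C); a rough estimate only."""
    s = seq.upper()
    return 2 * (s.count("A") + s.count("T")) + 4 * (s.count("G") + s.count("C"))


def longest_complementary_run(p1: str, p2: str) -> int:
    """Length of the longest stretch of p1 that can base-pair (antiparallel)
    with a stretch of p2, i.e. the longest common substring of p1 and the
    reverse complement of p2."""
    a = p1.upper()
    b = reverse_complement(p2)
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    forward = "ATGCGTACGTTAGCCTAG"   # invented example primer
    reverse = "GGATCCTTAACGGTCAAC"   # invented example primer
    print(rule_of_thumb_tm(forward), rule_of_thumb_tm(reverse))
    print(longest_complementary_run(forward, reverse))
```

A long complementary run between the two primers would suggest a risk of primer-dimer formation; what run length is acceptable is a judgment left to the practitioner.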

[0138] Generally, PCR primers may be used to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. However, the polymerase chain reaction may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the instant nucleic acid fragments, and the sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts at the 3′ end of the mRNA precursor encoding microbial genes. Alternatively, the second primer sequence may be based upon sequences derived from the cloning vector. For example, the skilled artisan can follow the RACE protocol (Frohman et al., PNAS USA 85:8998 (1988)) to generate cDNAs by using PCR to amplify copies of the region between a single point in the transcript and the 3′ or 5′ end. Primers oriented in the 3′ and 5′ directions can be designed from the instant sequences. Using commercially available 3′ RACE or 5′ RACE systems (BRL), specific 3′ or 5′ cDNA fragments can be isolated (Ohara et al., PNAS USA 86:5673 (1989); Loh et al., Science 243:217 (1989)).

[0139] Accordingly, the invention provides a method for identifying a nucleic acid molecule encoding a Baeyer-Villiger monooxygenase comprising: (a) synthesizing at least one oligonucleotide primer corresponding to a portion of the sequence selected from the group consisting of SEQ ID NOs:7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, and 45; and (b) amplifying an insert present in a cloning vector using the oligonucleotide primer of step (a); wherein the amplified insert encodes a Baeyer-Villiger monooxygenase.

[0140] Alternatively, the instant sequences may be employed as hybridization reagents for the identification of homologs. The basic components of a nucleic acid hybridization test include a probe, a sample suspected of containing the gene or gene fragment of interest, and a specific hybridization method. Probes of the present invention are typically single-stranded nucleic acid sequences which are complementary to the nucleic acid sequences to be detected. Probes are “hybridizable” to the nucleic acid sequence to be detected. The probe length can vary from bases to tens of thousands of bases, and will depend upon the specific test to be done. Typically a probe length of about 15 bases to about 30 bases is suitable. Only part of the probe molecule need be complementary to the nucleic acid sequence to be detected. In addition, the complementarity between the probe and the target sequence need not be perfect. Hybridization does occur between imperfectly complementary molecules with the result that a certain fraction of the bases in the hybridized region are not paired with the proper complementary base.

[0141] Hybridization methods are well defined. Typically the probe and sample must be mixed under conditions which will permit nucleic acid hybridization. This involves contacting the probe and sample in the presence of an inorganic or organic salt under the proper concentration and temperature conditions. The probe and sample nucleic acids must be in contact for a long enough time that any possible hybridization between the probe and sample nucleic acid may occur. The concentration of probe or target in the mixture will determine the time necessary for hybridization to occur. The higher the probe or target concentration, the shorter the hybridization incubation time needed. Optionally a chaotropic agent may be added. The chaotropic agent stabilizes nucleic acids by inhibiting nuclease activity. Furthermore, the chaotropic agent allows sensitive and stringent hybridization of short oligonucleotide probes at room temperature [Van Ness and Chen (1991) Nucl. Acids Res. 19:5143-5151]. Suitable chaotropic agents include guanidinium chloride, guanidinium thiocyanate, sodium thiocyanate, lithium tetrachloroacetate, sodium perchlorate, rubidium tetrachloroacetate, potassium iodide, and cesium trifluoroacetate, among others. Typically, the chaotropic agent will be present at a final concentration of about 3M. If desired, one can add formamide to the hybridization mixture, typically 30-50% (v/v).

[0142] Various hybridization solutions can be employed. Typically, these comprise from about 20 to 60% volume, preferably 30%, of a polar organic solvent. A common hybridization solution employs about 30-50% v/v formamide, about 0.15 to 1M sodium chloride, about 0.05 to 0.1M buffers, such as sodium citrate, Tris-HCl, PIPES or HEPES (pH range about 6-9), about 0.05 to 0.2% detergent, such as sodium dodecylsulfate, or between 0.5-20 mM EDTA, FICOLL (Pharmacia Inc.) (about 300-500 kilodaltons), polyvinylpyrrolidone (about 250-500 kdal), and serum albumin. Also included in the typical hybridization solution will be unlabeled carrier nucleic acids from about 0.1 to 5 mg/mL, fragmented nucleic DNA, e.g., calf thymus or salmon sperm DNA, or yeast RNA, and optionally from about 0.5 to 2% wt/vol glycine. Other additives may also be included, such as volume exclusion agents which include a variety of polar water-soluble or swellable agents, such as polyethylene glycol, anionic polymers such as polyacrylate or polymethylacrylate, and anionic saccharidic polymers, such as dextran sulfate.

[0143] Thus, the invention provides a method for identifying a nucleic acid molecule encoding a Baeyer-Villiger monooxygenase comprising: (a) probing a genomic library with a portion of a nucleic acid molecule selected from the group consisting of SEQ ID NOs:7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, and 45; (b) identifying a DNA clone that hybridizes under conditions of 0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS with the nucleic acid molecule of (a); and (c) sequencing the genomic fragment that comprises the clone identified in step (b), wherein the sequenced genomic fragment encodes Baeyer-Villiger monooxygenase.
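As an aid to reading the wash conditions above, the following Python sketch converts an SSC strength into salt concentrations and computes the dilution of a concentrated stock. It assumes the conventional definition of 1×SSC as 0.15 M sodium chloride and 0.015 M sodium citrate, which is not stated in this specification; the function names and the 20× stock are likewise assumptions, and the SDS component is not handled.

```python
# Sketch of SSC arithmetic; assumes 1x SSC = 0.15 M NaCl + 0.015 M sodium citrate.

def ssc_components(fold: float) -> dict:
    """Molar NaCl and sodium citrate concentrations of an SSC solution."""
    return {"NaCl_M": 0.15 * fold, "citrate_M": 0.015 * fold}


def dilution_volumes(stock_fold: float, target_fold: float, final_ml: float):
    """Volumes (mL) of stock SSC and of water needed for the target strength."""
    if target_fold > stock_fold:
        raise ValueError("target strength exceeds stock strength")
    stock_ml = final_ml * target_fold / stock_fold
    return stock_ml, final_ml - stock_ml


if __name__ == "__main__":
    for fold in (2.0, 0.1):
        print(fold, ssc_components(fold))
    # Preparing 1 L of the 0.1x wash from a hypothetical 20x stock.
    print(dilution_volumes(20.0, 0.1, 1000.0))
```

The lower salt concentration of the 0.1×SSC wash relative to the 2×SSC wash is what makes the final wash the more stringent of the two.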

[0144] Recombinant Expression-Microbial

[0145] The genes and gene products of the present BVMO sequences may be introduced into microbial host cells. Preferred host cells for expression of the instant genes and nucleic acid molecules are microbial hosts that can be found broadly within the fungal or bacterial families and which grow over a wide range of temperature, pH values, and solvent tolerances. Because the transcription, translation and protein biosynthetic apparatus is the same irrespective of the cellular feedstock, functional genes are expressed irrespective of the carbon feedstock used to generate cellular biomass. Large scale microbial growth and functional gene expression may utilize a wide range of simple or complex carbohydrates, organic acids and alcohols, saturated hydrocarbons such as methane, or carbon dioxide in the case of photosynthetic or chemoautotrophic hosts. However, the functional genes may be regulated, repressed or depressed by specific growth conditions, which may include the form and amount of nitrogen, phosphorous, sulfur, oxygen, carbon or any trace micronutrient including small inorganic ions. In addition, the regulation of functional genes may be achieved by the presence or absence of specific regulatory molecules that are added to the culture and are not typically considered nutrient or energy sources. Growth rate may also be an important regulatory factor in gene expression. Examples of suitable host strains include but are not limited to fungal or yeast species such as Aspergillus, Trichoderma, Saccharomyces, Pichia, Candida, Hansenula, or bacterial species such as members of the Proteobacteria and Actinomycetes as well as the specific genera Rhodococcus, Acinetobacter, Arthrobacter, Mycobacteria, Nocardia, Brevibacterium, Acidovorax, Bacillus, Streptomyces, Escherichia, Salmonella, Pseudomonas, Aspergillus, Saccharomyces, Pichia, Candida, Corynebacterium, and Hansenula.

[0146] Particularly suitable in the present invention as hosts for monooxygenase are the members of the Proteobacteria and Actinomycetes. The Proteobacteria form a physiologically diverse group of microorganisms and represent five subdivisions (α, β, γ, ε, δ) (Madigan et al., Brock Biology of Microorganisms, 8th edition, Prentice Hall, Upper Saddle River, N.J. (1997)). All five subdivisions of the Proteobacteria contain microorganisms that use organic compounds as sources of carbon and energy. Members of the Proteobacteria suitable in the present invention include, but are not limited to, Burkholderia, Alcaligenes, Pseudomonas, Sphingomonas, Pandoraea, Delftia and Comamonas.

[0147] Microbial expression systems and expression vectors containing regulatory sequences that direct high level expression of foreign proteins are well known to those skilled in the art. Any of these could be used to construct chimeric genes for production of any of the gene products of the instant sequences. These chimeric genes could then be introduced into appropriate microorganisms via transformation to provide high level expression of the enzymes.

[0148] Vectors or cassettes useful for the transformation of suitable host cells are well known in the art. Typically the vector or cassette contains sequences directing transcription and translation of the relevant gene, a selectable marker, and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5′ of the gene which harbors transcriptional initiation controls and a region 3′ of the DNA fragment which controls transcriptional termination. It is most preferred when both control regions are derived from genes homologous to the transformed host cell, although it is to be understood that such control regions need not be derived from the genes native to the specific species chosen as a production host.

[0149] Initiation control regions or promoters, which are useful to drive expression of the instant ORF's in the desired host cell, are numerous and familiar to those skilled in the art. Virtually any promoter capable of driving these genes is suitable for the present invention including but not limited to CYC1, HIS3, GAL1, GAL10, ADH1, PGK, PHO5, GAPDH, ADC1, TRP1, URA3, LEU2, ENO, TPI (useful for expression in Saccharomyces); AOX1 (useful for expression in Pichia); and lac, ara, tet, trp, λP_(L), λP_(R), T7, tac, and trc (useful for expression in Escherichia coli) as well as the amy, apr, npr promoters and various phage promoters useful for expression in Bacillus.

[0150] Termination control regions may also be derived from various genes native to the preferred hosts. Optionally, a termination site may be unnecessary; however, it is most preferred if included.

[0151] Recombinant Expression-Plants

[0152] The sequences encoding the BVMO's of the present invention may be used to create transgenic plants having the ability to express the microbial proteins. Preferred plant hosts will be any variety that will support a high production level of the instant proteins.

[0153] Suitable green plants include, but are not limited to, soybean, rapeseed (Brassica napus, B. campestris), sunflower (Helianthus annuus), cotton (Gossypium hirsutum), corn, tobacco (Nicotiana tabacum), alfalfa (Medicago sativa), wheat (Triticum sp), barley (Hordeum vulgare), oats (Avena sativa, L), sorghum (Sorghum bicolor), rice (Oryza sativa), Arabidopsis, cruciferous vegetables (broccoli, cauliflower, cabbage, parsnips, etc.), melons, carrots, celery, parsley, tomatoes, potatoes, strawberries, peanuts, grapes, grass seed crops, sugar beets, sugar cane, beans, peas, rye, flax, hardwood trees, softwood trees, and forage grasses. Algal species include, but are not limited to, commercially significant hosts such as Spirulina and Dunaliella. Overexpression of the proteins of the instant invention may be accomplished by first constructing chimeric genes in which the coding regions are operably linked to promoters capable of directing expression of a gene in the desired tissues at the desired stage of development. For reasons of convenience, the chimeric genes may comprise promoter sequences and translation leader sequences derived from the same genes. 3′ Non-coding sequences encoding transcription termination signals must also be provided. The instant chimeric genes may also comprise one or more introns in order to facilitate gene expression.

[0154] Any combination of any promoter and any terminator capable of inducing expression of a coding region may be used in the chimeric genetic sequence. Some suitable examples of promoters and terminators include those from nopaline synthase (nos), octopine synthase (ocs) and cauliflower mosaic virus (CaMV) genes. One type of efficient plant promoter that may be used is a high level plant promoter. Such promoters, in operable linkage with the genetic sequences of the present invention, should be capable of promoting expression of the present gene product. High level plant promoters that may be used in this invention include the promoter of the small subunit (ss) of the ribulose-1,5-bisphosphate carboxylase, for example from soybean (Berry-Lowe et al., J. Molecular and App. Gen., 1:483-498 (1982)), and the promoter of the chlorophyll a/b binding protein. These two promoters are known to be light-induced in plant cells (See, for example, Genetic Engineering of Plants, an Agricultural Perspective, A. Cashmore, Plenum, N.Y. (1983), pages 29-38; Coruzzi, G. et al., The Journal of Biological Chemistry, 258:1399 (1983), and Dunsmuir, P. et al., Journal of Molecular and Applied Genetics, 2:285 (1983)).

[0155] Plasmid vectors comprising the instant chimeric genes can then be constructed. The choice of plasmid vector depends upon the method that will be used to transform host plants. The skilled artisan is well aware of the genetic elements that must be present on the plasmid vector in order to successfully transform, select and propagate host cells containing the chimeric gene. The skilled artisan will also recognize that different independent transformation events will result in different levels and patterns of expression (Jones et al., EMBO J. 4:2411-2418 (1985); De Almeida et al., Mol. Gen. Genetics 218:78-86 (1989)), and thus that multiple events must be screened in order to obtain lines displaying the desired expression level and pattern. Such screening may be accomplished by Southern analysis of DNA blots (Southern, J. Mol. Biol. 98:503 (1975)), Northern analysis of mRNA expression (Kroczek, J. Chromatogr. Biomed. Appl., 618 (1-2):133-145 (1993)), Western analysis of protein expression, or phenotypic analysis.

[0156] For some applications it will be useful to direct the instant proteins to different cellular compartments. It is thus envisioned that the chimeric genes described above may be further supplemented by altering the coding sequences to encode enzymes with appropriate intracellular targeting sequences such as transit sequences (Keegstra, K., Cell 56:247-253 (1989)), signal sequences or sequences encoding endoplasmic reticulum localization (Chrispeels, J. J., Ann. Rev. Plant Phys. Plant Mol. Biol. 42:2.1-53 (1991)), or nuclear localization signals (Raikhel, N. Plant Phys. 100:1627-1632 (1992)) added and/or with targeting sequences that are already present removed. While the references cited give examples of each of these, the list is not exhaustive and more targeting signals of utility may be discovered in the future that are useful in the invention.

[0157] Process for the Production of Lactones and Esters from Ketone Substrates

[0158] Once the appropriate nucleic acid sequence has been expressed in a recombinant organism, the organism may be contacted with a suitable ketone substrate for the production of the corresponding ester. The Baeyer-Villiger monooxygenases of the instant invention will act on a variety of ketone substrates comprising cyclic ketones and ketoterpenes to produce the corresponding lactone or ester. Suitable ketone substrates for the conversion to esters are defined by the general formula:

[0159] wherein R and R₁ are independently selected from substituted or unsubstituted phenyl, substituted or unsubstituted alkyl, or substituted or unsubstituted alkenyl or substituted or unsubstituted alkylidene. Particularly useful ketone substrates include, but are not limited to, Norcamphor, Cyclobutanone, Cyclopentanone, 2-methyl-cyclopentanone, Cyclohexanone, 2-methyl-cyclohexanone, Cyclohex-2-ene-1-one, 1,2-cyclohexanedione, 1,3-cyclohexanedione, 1,4-cyclohexanedione, Cycloheptanone, Cyclooctanone, Cyclodecanone, Cycloundecanone, Cyclododecanone, Cyclotridecanone, Cyclopentadecanone, 2-tridecanone, dihexyl ketone, 2-phenyl-cyclohexanone, Oxindole, Levoglucosenone, dimethyl sulfoxide, dimethy-2-piperidone, Phenylboronic acid, and beta-ionone.

[0160] Alternatively, it is contemplated that the enzymes of the invention may be used in vitro for the transformation of ketone substrates to the corresponding esters. The monooxygenase enzymes may be produced recombinantly or isolated from native sources, purified and reacted with the appropriate substrate under suitable conditions of pH and temperature.

[0161] Where large scale commercial production of lactones or esters is desired, a variety of culture methodologies may be applied. For example, large scale production from a recombinant microbial host may be carried out by either batch or continuous culture methodologies.

[0162] A classical batch culturing method is a closed system where the composition of the media is set at the beginning of the culture and not subject to artificial alterations during the culturing process. Thus, at the beginning of the culturing process the media is inoculated with the desired organism or organisms and growth or metabolic activity is permitted to occur adding nothing to the system. Typically, however, a “batch” culture is batch with respect to the addition of carbon source and attempts are often made at controlling factors such as pH and oxygen concentration. In batch systems the metabolite and biomass compositions of the system change constantly up to the time the culture is terminated. Within batch cultures cells moderate through a static lag phase to a high growth log phase and finally to a stationary phase where growth rate is diminished or halted. If untreated, cells in the stationary phase will eventually die. Cells in log phase are often responsible for the bulk of production of end product or intermediate in some systems. Stationary or post-exponential phase production can be obtained in other systems.

[0163] A variation on the standard batch system is the Fed-Batch system. Fed-Batch culture processes are also suitable in the present invention and comprise a typical batch system with the exception that the substrate is added in increments as the culture progresses. Fed-Batch systems are useful when catabolite repression is apt to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the media. Measurement of the actual substrate concentration in Fed-Batch systems is difficult and is therefore estimated on the basis of the changes of measurable factors such as pH, dissolved oxygen and the partial pressure of waste gases such as CO₂. Batch and Fed-Batch culturing methods are common and well known in the art and examples may be found in Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Edition (1989) Sinauer Associates, Inc., Sunderland, Mass., or Deshpande, Mukund V., Appl. Biochem. Biotechnol., 36, 227, (1992), herein incorporated by reference.

[0164] Commercial production of lactones and esters of the present invention may also be accomplished with a continuous culture. Continuous cultures are an open system where a defined culture media is added continuously to a bioreactor and an equal amount of conditioned media is removed simultaneously for processing. Continuous cultures generally maintain the cells at a constant high liquid phase density where cells are primarily in log phase growth. Alternatively, continuous culture may be practiced with immobilized cells where carbon and nutrients are continuously added, and valuable products, by-products or waste products are continuously removed from the cell mass. Cell immobilization may be performed using a wide range of solid supports composed of natural and/or synthetic materials.

[0165] Continuous or semi-continuous culture allows for the modulation of one factor or any number of factors that affect cell growth or end product concentration. For example, one method will maintain a limiting nutrient such as the carbon source or nitrogen level at a fixed rate and allow all other parameters to moderate. In other systems a number of factors affecting growth can be altered continuously while the cell concentration, measured by media turbidity, is kept constant. Continuous systems strive to maintain steady state growth conditions and thus the cell loss due to media being drawn off must be balanced against the cell growth rate in the culture. Methods of modulating nutrients and growth factors for continuous culture processes as well as techniques for maximizing the rate of product formation are well known in the art of industrial microbiology and a variety of methods are detailed by Brock, supra.
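The steady state balance described above, in which cell loss by withdrawal of media equals cell growth, may be illustrated numerically. The following Python sketch is offered only as an illustration and assumes a textbook chemostat with Monod growth kinetics, a model this specification does not itself adopt; the function name and all parameter values are hypothetical.

```python
# Chemostat steady state under ASSUMED Monod kinetics:
#   mu(S) = mu_max * S / (Ks + S); at steady state mu equals the dilution rate D.

def chemostat_steady_state(flow_l_per_h, volume_l, mu_max, ks, s_feed, yield_xs):
    """Return (D, residual substrate S*, biomass X*), or (D, s_feed, 0.0) on washout.

    flow_l_per_h / volume_l gives the dilution rate D in 1/h, i.e. the rate at
    which cells are lost with the withdrawn media.
    """
    d = flow_l_per_h / volume_l
    d_crit = mu_max * s_feed / (ks + s_feed)   # highest growth rate the feed supports
    if d >= d_crit:
        return d, s_feed, 0.0                  # cells are removed faster than they grow
    s_star = ks * d / (mu_max - d)             # substrate level at which mu(S*) = D
    x_star = yield_xs * (s_feed - s_star)      # biomass from the substrate consumed
    return d, s_star, x_star


if __name__ == "__main__":
    # Hypothetical values: 2 L/h through a 10 L vessel, mu_max 0.5 1/h,
    # Ks 0.1 g/L, feed substrate 10 g/L, yield 0.5 g cells per g substrate.
    print(chemostat_steady_state(2.0, 10.0, 0.5, 0.1, 10.0, 0.5))
```

Under these assumptions, raising the flow rate toward the critical dilution rate lowers the steady state cell concentration until the culture washes out.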

[0166] Baeyer-Villiger Monooxygenases Having Enhanced Activity

[0167] It is contemplated that the present BVMO sequences may be used to produce gene products having enhanced or altered activity. Various methods are known for mutating a native gene sequence to produce a gene product with altered or enhanced activity including but not limited to error prone PCR (Melnikov et al., Nucleic Acids Research, (Feb. 15, 1999) Vol. 27, No. 4, pp. 1056-1062); site directed mutagenesis (Coombs et al., Proteins (1998), 259-311, 1 plate. Editor(s): Angeletti, Ruth Hogue. Publisher: Academic, San Diego, Calif.) and “gene shuffling” (U.S. Pat. Nos. 5,605,793; 5,811,238; 5,830,721; and 5,837,458, incorporated herein by reference).

[0168] The method of gene shuffling is particularly attractive due to its facile implementation, high rate of mutagenesis and ease of screening. The process of gene shuffling involves the restriction endonuclease cleavage of a gene of interest into fragments of specific size in the presence of additional populations of DNA regions of both similarity to or difference to the gene of interest. This pool of fragments will then be denatured and reannealed to create a mutated gene. The mutated gene is then screened for altered activity.

[0169] The BVMO sequences of the present invention may be mutated and screened for altered or enhanced activity by this method. The sequences should be double stranded and can be of various lengths ranging from 50 bp to 10 kb. The sequences may be randomly digested into fragments ranging from about 10 bp to 1000 bp, using restriction endonucleases well known in the art (Maniatis supra). In addition to the instant microbial sequences, populations of fragments that are hybridizable to all or portions of the microbial sequence may be added. Similarly, a population of fragments which are not hybridizable to the instant sequence may also be added. Typically these additional fragment populations are added in about a 10 to 20 fold excess by weight as compared to the total nucleic acid. Generally if this process is followed the number of different specific nucleic acid fragments in the mixture will be about 100 to about 1000. The mixed population of random nucleic acid fragments is denatured to form single-stranded nucleic acid fragments and then reannealed. Only those single-stranded nucleic acid fragments having regions of homology with other single-stranded nucleic acid fragments will reanneal. The random nucleic acid fragments may be denatured by heating. One skilled in the art could determine the conditions necessary to completely denature the double stranded nucleic acid. Preferably the temperature is from 80° C. to 100° C. The nucleic acid fragments may be reannealed by cooling. Preferably the temperature is from 20° C. to 75° C. Renaturation can be accelerated by the addition of polyethylene glycol (“PEG”) or salt. A suitable salt concentration may range from 0 mM to 200 mM. The annealed nucleic acid fragments are then incubated in the presence of a nucleic acid polymerase and dNTP's (i.e. dATP, dCTP, dGTP and dTTP). The nucleic acid polymerase may be the Klenow fragment, the Taq polymerase or any other DNA polymerase known in the art. The polymerase may be added to the random nucleic acid fragments prior to annealing, simultaneously with annealing or after annealing. The cycle of denaturation, renaturation and incubation in the presence of polymerase is repeated for a desired number of times. Preferably the cycle is repeated from 2 to 50 times, more preferably the sequence is repeated from 10 to 40 times. The resulting nucleic acid is a larger double-stranded polynucleotide ranging from about 50 bp to about 100 kb and may be screened for expression and altered activity by standard cloning and expression protocol (Maniatis supra).
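The statement that only fragments sharing regions of homology will reanneal can be pictured with a toy computational model. The Python sketch below is a deliberately simplified illustration and not a simulation of the laboratory procedure: it assumes pre-aligned parent sequences of equal length, copies random-length fragments from one parent at a time, and allows a switch to another parent only where both parents are identical over a short junction. The function name, the fragment lengths, and the junction length are hypothetical choices.

```python
import random


def shuffle_chimera(parents, min_frag=10, max_frag=50, overlap=4, rng=None):
    """Build one chimeric sequence from aligned, equal-length parent sequences.

    A fragment of random length is copied from the current parent; the model
    then switches to a randomly chosen parent only if that parent matches the
    current one over the last `overlap` bases (a stand-in for homology).
    """
    if rng is None:
        rng = random.Random()
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("this toy model requires parents of equal length")
    current = rng.randrange(len(parents))
    pieces = []
    pos = 0
    while pos < length:
        end = min(pos + rng.randint(min_frag, max_frag), length)
        pieces.append(parents[current][pos:end])
        pos = end
        candidate = rng.randrange(len(parents))
        junction = slice(max(0, pos - overlap), pos)
        if parents[candidate][junction] == parents[current][junction]:
            current = candidate   # homologous junction: template switch allowed
    return "".join(pieces)


if __name__ == "__main__":
    parent_a = "ATGGCTAGCTAGGATCCGATCGATCGGCTAGCTAGCTAACGT" * 3   # invented
    parent_b = "ATGGCTAGCTAGGTTCCGATCGATCGGCTAGCAAGCTAACGT" * 3   # invented
    print(shuffle_chimera([parent_a, parent_b], rng=random.Random(1)))
```

In this model every position of the output is inherited from one of the parents at the same position, so the chimera differs from the parents only in how their existing differences are combined.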

[0170] Furthermore, a hybrid protein can be assembled by fusion of functional domains using the gene shuffling (exon shuffling) method (Nixon et al., PNAS, 94:1069-1073 (1997)). The functional domain of the instant gene can be combined with the functional domain of other genes to create novel enzymes with desired catalytic function. A hybrid enzyme may be constructed using the PCR overlap extension method and cloned into the various expression vectors using techniques well known to those skilled in the art.

EXAMPLES

[0171] The present invention is further defined in the following Examples. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions.

[0172] General Methods

[0173] Standard recombinant DNA and molecular cloning techniques used in the Examples are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, N.Y. (1989) (Maniatis) and by T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984) and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, pub. by Greene Publishing Assoc. and Wiley-Interscience (1987).

[0174] Materials and methods suitable for the maintenance and growth of bacterial cultures are well known in the art. Techniques suitable for use in the following examples may be found as set out in Manual of Methods for General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, Eds., American Society for Microbiology, Washington, D.C. (1994)) or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, Second Ed., Sinauer Associates, Inc.: Sunderland, Mass. (1989). All reagents, restriction enzymes and materials used for the growth and maintenance of bacterial cells were obtained from Aldrich Chemicals (Milwaukee, Wis.), DIFCO Laboratories (Detroit, Mich.), GIBCO/BRL (Gaithersburg, Md.), or Sigma Chemical Company (St. Louis, Mo.) unless otherwise specified.

[0175] Bacterial Strains and Plasmids: Rhodococcus erythropolis AN12, Brevibacterium sp. HCU, Arthrobacter sp. BP2, Rhodococcus sp. phi1, Rhodococcus sp. phi2, Acidovorax sp. CHX, and Acinetobacter sp. SE19 were isolated from enrichment of activated sludge obtained from industrial wastewater treatment facilities. Max Efficiency competent cells of E. coli DH5α and DH10B were purchased from GIBCO/BRL (Gaithersburg, Md.). Expression plasmid pQE30 was purchased from Qiagen (Valencia, Calif.), while cloning vector pCR2.1 and expression vector pTrc/His2-Topo were purchased from Invitrogen (San Diego, Calif.).

[0176] Taxonomic identification of Rhodococcus erythropolis AN12, Brevibacterium sp. HCU, Arthrobacter sp. BP2, Rhodococcus sp. phi1, Rhodococcus sp. phi2, Acidovorax sp. CHX, and Acinetobacter sp. SE19 was performed by PCR amplification of 16S rDNA from chromosomal DNA using primers corresponding to conserved regions of the 16S rDNA molecule (Table 2). The following temperature program was used: 95° C. (5 min) for 1 cycle followed by 25 cycles of: 95° C. (1 min), 55° C. (1 min), 72° C. (1 min), followed by a final extension at 72° C. (8 min). Following DNA sequencing (according to the method shown below), the 16S rDNA gene sequence of each isolate was used as the query sequence for a BLAST search (Altschul, et al., Nucleic Acids Res. 25:3389-3402 (1997)) against GenBank for similar sequences.

TABLE 2
Primers to Conserved Regions of 16S rDNA

SEQ ID NO   Primer Sequence (5′-3′)            Reference
50          GAGTTTGATCCTGGCTCAG (HK12)         Amann, R. I. et al. Microbiol. Rev. 59(1):143-69 (1995)
51          CAGG(A/C)GCCGCGGTAAT(A/T)C         Amann, R. I. et al. Microbiol. Rev. 59(1):143-69 (1995)
52          GCTGCCTCCCGTAGGAGT (HK21)          Amann, R. I. et al. Microbiol. Rev. 59(1):143-69 (1995)
53          CTACCAGGGTAACTAATCC                Amann, R. I. et al. Microbiol. Rev. 59(1):143-69 (1995)
54          ACGGGCGGTGTGTAC                    Amann, R. I. et al. Microbiol. Rev. 59(1):143-69 (1995)
55          CACGAGCTGACGACAGCCAT               Amann, R. I. et al. Microbiol. Rev. 59(1):143-69 (1995)
56          TACCTTGTTACGACTT (HK13)            Amann, R. I. et al. Microbiol. Rev. 59(1):143-69 (1995)
57          G(A/T)ATTACCGCGGC(G/T)GCTG         Amann, R. I. et al. Microbiol. Rev. 59(1):143-69 (1995)
58          GGATTAGATACCCTGGTAG                Amann, R. I. et al. Microbiol. Rev. 59(1):143-69 (1995)
59          ATGGCTGTCGTCAGCTCGTG               Amann, R. I. et al. Microbiol. Rev. 59(1):143-69 (1995)
60          GCCCCCG(C/T)CAATTCCT (HK15)        Kane, M. D. et al. Appl. Environ. Microbiol. 59:682-686 (1993)
61          GTGCCAGCAG(C/T)(A/C)GCGGT (HK14)   Kane, M. D. et al. Appl. Environ. Microbiol. 59:682-686 (1993)
62          GCCAGCAGCCGCGGTA (JCR15)           Kane, M. D. et al. Appl. Environ. Microbiol. 59:682-686 (1993)
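Several primers in Table 2 contain degenerate positions written as, for example, (A/C). As an illustrative aid only, the Python sketch below expands such a primer into the individual sequences it represents and totals the stated temperature program. The function names are hypothetical, the primer of SEQ ID NO:51 is taken here as CAGG(A/C)GCCGCGGTAAT(A/T)C, and the run time ignores temperature ramping between steps.

```python
import itertools
import re
from typing import List


def expand_degenerate(primer: str) -> List[str]:
    """Expand '(X/Y)' degenerate positions into every concrete primer sequence."""
    parts = re.split(r"(\([ACGT](?:/[ACGT])+\))", primer)
    choices = []
    for part in parts:
        if not part:
            continue                       # empty string between adjacent groups
        if part.startswith("("):
            choices.append(part[1:-1].split("/"))
        else:
            choices.append([part])
    return ["".join(combo) for combo in itertools.product(*choices)]


def program_minutes(initial_min, cycles, step_minutes, final_min):
    """Total hold time of a PCR program, excluding ramp times."""
    return initial_min + cycles * sum(step_minutes) + final_min


if __name__ == "__main__":
    for seq in expand_degenerate("CAGG(A/C)GCCGCGGTAAT(A/T)C"):
        print(seq)
    # 95 C 5 min; 25 cycles of 1 min + 1 min + 1 min; final extension 8 min
    print(program_minutes(5, 25, (1, 1, 1), 8))
```

A primer with two two-fold degenerate positions therefore corresponds to a mixture of four distinct oligonucleotides.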

[0177] Sequencing

[0178] Sequence was generated on an ABI Automatic sequencer using dye terminator technology (U.S. Pat. No. 5,366,860; EP 272007) using a combination of vector and insert-specific primers. Sequence editing was performed using either Sequencher (Gene Codes Corp., Ann Arbor, Mich.), or the Wisconsin GCG program (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.) and the CONSED package (version 7.0). All sequences represent coverage at least two times in both directions.

[0179] Manipulations of genetic sequences were accomplished using the suite of programs available from the Genetics Computer Group Inc. (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.). Where the GCG program “Pileup” was used, the gap creation default value of 12 and the gap extension default value of 4 were used. Where the GCG “Gap” or “Bestfit” programs were used, the default gap creation penalty of 50 and the default gap extension penalty of 3 were used. In any case where GCG program parameters were not prompted for, in these or any other GCG program, default values were used.

[0180] The meaning of abbreviations is as follows: “sec” means second(s), “min” means minute(s), “h” means hour(s), “d” means day(s), “μL” means microliter(s), “mL” means milliliter(s), “L” means liter(s), “μM” means micromolar, “mM” means millimolar, “M” means molar, “mmol” means millimole(s), “μmole” means micromole(s), “g” means gram(s), “μg” means microgram(s), “ng” means nanogram(s), “U” means unit(s), “mU” means milliunit(s), “ppm” means parts per million, “psi” means pounds per square inch, and “kB” means kilobase(s).

Example 1 Monooxygenase Gene Discovery in a Mixed Microbial Population

[0181] This Example describes the isolation of the cyclohexanone-degrading organisms Arthrobacter sp. BP2, Rhodococcus sp. phi1, and Rhodococcus sp. phi2 by enrichment of a mixed microbial community. Differential display techniques applied to cultures containing the mixed microbial population permitted discovery of monooxygenase genes.

[0182] Enrichment for Cyclohexanone Degraders

[0183] A mixed microbial community was obtained from a wastewater bioreactor and maintained on minimal medium (50 mM KHPO₄ (pH 7.0), 10 mM (NH₄)₂SO₄, 2 mM MgCl₂, 0.7 mM CaCl₂, 50 μM MnCl₂, 1 μM FeCl₃, 1 μM ZnCl₂, 1.72 μM CuSO₄, 2.53 μM CoCl₂, 2.42 μM Na₂MoO₄, and 0.0001% FeSO₄) with trace amounts of yeast extract, casamino acids, and peptone (YECAAP) at 0.1% concentration, with 0.1% cyclohexanol and cyclohexanone added as carbon sources. Increased culture growth in the presence of cyclohexanone indicated a microbial population with members that could convert cyclohexanone.

[0184] Isolation of Strains

[0185] Seven individual strains were isolated from the community by spreading culture on R2A Agar (Becton Dickinson and Company, Cockeysville, Md.) at 30° C. Strains were streaked to purity on the same medium. Among these seven strains, the strain identified as Arthrobacter species BP2 formed large colonies of a light yellow color. One Rhodococcus strain, identified as species phi1, formed small colonies that were orange in color. The other Rhodococcus strain, designated species phi2, formed small colonies that were red in color.

[0186] Individual strains were identified by comparing 16S rDNA sequences to known 16S rRNA sequences in the GenBank sequence database. The 16S rRNA gene sequence from strain BP2 (SEQ ID NO:1) was at least 99% homologous to the 16S rRNA gene sequences of bacteria belonging to the genus Arthrobacter. The 16S rRNA gene sequences from strains phi1 and phi2 were each at least 99% homologous to the 16S rRNA gene sequences of bacteria belonging to the genus of gram positive bacteria, Rhodococcus. The complete 16S rDNA sequence of Rhodococcus sp. phi1 is shown as SEQ ID NO:2, while that of Rhodococcus sp. phi2 is listed as SEQ ID NO:3.
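The percentage figures above are most naturally read as percent identity over an alignment of the two 16S sequences. As an illustration only, the sketch below computes percent identity for a pair of already aligned sequences. The function name and the short example strings are hypothetical, and the convention of counting gapped columns in the denominator is one assumption among several in common use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical residues ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1  # identical residues; a gap against a residue never matches
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment: one gapped column and one mismatch out of ten columns.
identity = percent_identity("ACGT-ACGTA", "ACGTTACGTC")
print(f"{identity:.1f}% identity")
```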

[0187] Induction of Cyclohexanone Oxidation Genes

[0188] For induction of cyclohexanone oxidation genes within members of this community, 1 ml of inoculum from a wastewater bioreactor was suspended in 25 ml minimal medium with 0.1% YECAAP and incubated overnight at 30° C. with agitation. The next day 10 ml of the overnight culture was resuspended in a total volume of 50 ml minimal medium with 0.1% YECAAP. The optical density of the culture was 0.29 absorbance units at 600 nm. After equilibration at 30° C. for 30 min, the culture was split into two separate 25 ml volumes. To one of these cultures, 25 μl (0.1%) cyclohexanone (Sigma-Aldrich, St. Louis, Mo.) was added. Both cultures were incubated for an additional 3 hr. At this time, cultures were moved onto ice, harvested by centrifugation at 4° C., washed with two volumes of minimal salts medium and diluted to an optical density of 1.0 absorbance unit (600 nm). Approximately 6 ml of culture was placed in a water jacketed respirometry cell equipped with an oxygen electrode (Yellow Springs Instruments Co., Yellow Springs, Ohio) at 30° C. to confirm that the cyclohexanone oxidation enzymes were induced. After establishing the baseline respiration for each cell suspension, cyclohexanone was added to a final concentration of 0.1% and the rate of O₂ consumption was further monitored. For the control culture, 2 mM potassium acetate was added 200 sec after the cyclohexanone.
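The volumes in this procedure follow from simple dilution arithmetic. The sketch below, offered only as an illustration, checks that 25 μl added to 25 ml corresponds to 0.1% (v/v) and shows how a washed cell suspension might be brought to a target optical density. The function names and the starting optical density of 2.5 are hypothetical, and optical density is assumed to scale linearly with dilution.

```python
def percent_v_v(solute_ul: float, total_ml: float) -> float:
    """Volume percent of a liquid additive, given microliters added to milliliters of culture."""
    return 100.0 * (solute_ul / 1000.0) / total_ml


def dilute_to_od(current_od: float, target_od: float, final_ml: float) -> tuple[float, float]:
    """Return (suspension ml, diluent ml) giving final_ml at target_od (C1*V1 = C2*V2)."""
    if target_od > current_od:
        raise ValueError("cannot reach a higher optical density by dilution")
    suspension_ml = final_ml * target_od / current_od
    return suspension_ml, final_ml - suspension_ml


inducer_percent = percent_v_v(25, 25)            # 25 ul cyclohexanone in 25 ml culture
suspension_ml, medium_ml = dilute_to_od(2.5, 1.0, 10.0)  # hypothetical starting OD of 2.5
print(inducer_percent, suspension_ml, medium_ml)
```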

[0189] Isolation of Total Community RNA

[0190] After the 3 hr induction period with cyclohexanone described above, the control and induced samples (2 mL each) were harvested at 1400 rpm in a 4° C. centrifuge and resuspended in 900 μl Buffer RLT (Qiagen, Valencia, Calif.). A 300 μl volume of zirconia beads (Biospec Products, Bartlesville, Okla.) was added and cells were disrupted using a bead beater (Biospec Products) at 2400 beats per min for 3 min. Each of these samples was split into six aliquots for nucleic acid isolation using the RNeasy Mini Kit (Qiagen, Valencia, Calif.) and each was eluted with 100 μl RNase-free dH₂O supplied with the kit. DNA was degraded in the samples using 10 mM MgCl₂, 60 mM KCl and 2 U RNase-free DNase I (Ambion, Austin, Tex.) at 37° C. for 4 hr. Following testing for total DNA degradation by PCR using one of the arbitrary oligonucleotides used for RT-PCR, RNA was purified using the RNeasy Mini Kit and eluted in 100 μl RNase-free dH₂O as described previously.

[0191] Generation of RAPDs from Arbitrarily Reverse-transcribed Total RNA

[0192] A set of 244 primers with the sequence CGGAGCAGATCGAVVVV (SEQ ID NO:63; where VVVV represent all the combinations of the three bases A, G and C) was used in separate RT-PCR reactions with RNA from either the control or induced cells. The SuperScript™ One-Step™ RT-PCR System (Life Technologies Gibco BRL, Rockville, Md.) reaction mixture was used with 2-5 ng of total RNA in a 25 μl total reaction volume. The PCR was conducted using the following temperature program:

[0193] 1 cycle: 4° C. (2 min), 5 min ramp to 37° C. (1 hr), followed by 95° C. incubation (3 min);

[0194] 1 cycle: 94° C. (1 min), 40° C. (5 min), and 72° C. (5 min);

[0195] 40 cycles: 94° C. (1 min), 60° C. (1 min), and 72° C. (1 min);

[0196] 1 cycle: 70° C. (5 min) and 4° C. hold until separated by electrophoresis.
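The temperature program above can be written down as data, which makes the total programmed time easy to check. The following is a minimal sketch; the data layout and function name are illustrative, instrument ramp times other than the stated 5 min ramp are ignored, and the final 4° C. hold is excluded because its length is not specified.

```python
# Each entry: (number of cycles, [(step label, minutes), ...]).
RT_PCR_PROGRAM = [
    (1, [("4 C", 2), ("ramp to 37 C", 5), ("37 C", 60), ("95 C", 3)]),
    (1, [("94 C", 1), ("40 C", 5), ("72 C", 5)]),
    (40, [("94 C", 1), ("60 C", 1), ("72 C", 1)]),
    (1, [("70 C", 5)]),
]


def total_minutes(program) -> int:
    """Sum of step durations, with each block multiplied by its cycle count."""
    return sum(cycles * sum(minutes for _, minutes in steps) for cycles, steps in program)


print(total_minutes(RT_PCR_PROGRAM), "min of programmed holds")
```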

[0197] Products of these PCR amplifications (essentially RAPD fragments) were separated by electrophoresis at 1 V/cm on polyacrylamide gels (Amersham Pharmacia Biotech, Piscataway, N.J.). Fragments resulting from the control mRNA (no cyclohexanone induction) and from the induced mRNA were visualized by silver staining using an automated gel stainer (Amersham Pharmacia Biotech, Piscataway, N.J.).

[0198] Reamplification of Differentially Expressed DNA Fragments

[0199] A 25 μl volume of a sodium cyanide elution buffer (10 mg/ml NaCN, 20 mM Tris-HCl (pH 8.0), 50 mM KCl and 0.05% NP40) was incubated with an excised gel band of a differentially displayed fragment at 95° C. for 20 min. Reamplification of this DNA fragment was achieved in a 25 μl PCR reaction using 5 μl of the elution mixture and the primer from which the fragment was originally generated. The temperature program for reamplification was: 94° C. (5 min); 20 cycles of 94° C. (1 min), 55° C. (1 min), and 72° C. (1 min); followed by 72° C. (7 min). The reamplification products were directly cloned into the pCR2.1-TOPO vector (Invitrogen, Carlsbad, Calif.) and were sequenced using an ABI model 377 with ABI BigDye terminator sequencing chemistry (PerSeptive Biosystems, Framingham, Mass.). Eight clones were submitted for sequencing from each reamplified band. The nucleotide sequence of the cloned fragments was compared against the non-redundant GenBank database using the BlastX program (NCBI).

[0200] Sequencing of Cyclohexanone Oxidation Pathway Genes

[0201] Oligonucleotides were designed to amplify by PCR individual differentially expressed fragments. Following DNA isolation from individual strains, these oligonucleotide primers were used to determine which strain contained DNA encoding the individual differentially expressed fragments. Cosmids were screened by PCR using primers designed against differentially displayed fragments with homology to known cyclohexanone degradation genes. Each recombinant E. coli cell culture carrying a cosmid clone (1.0 μl) was used as the template in a 25 μl PCR reaction mixture. The primer pair A102FI (SEQ ID NO:108) and CONR (SEQ ID NO:109) was used to screen the Arthrobacter sp. BP2 library, primer pair A228FI (SEQ ID NO:110) and A228RI (SEQ ID NO:111) was used to screen the Rhodococcus sp. phi2 library, and the primer pair A2FI (SEQ ID NO:112) and A34RI (SEQ ID NO:113) was used to screen the Rhodococcus sp. phi1 library. Cosmids from recombinant E. coli which produced the correct product size in PCR reactions were isolated and digested partially with Sau3AI, and 10-15 kB fragments from this partial digest were sub-cloned into the blue/white screening vector pSU19 (Bartolome, B. et al. Gene. 102(1): 75-8 (Jun 15, 1991); Martinez, E. et al. Gene. 68(1): 159-62 (Aug 15, 1988)). These sub-clones were isolated using Qiagen Turbo96 Miniprep kits and re-screened by PCR as previously described. Sub-clones carrying the correct sequence fragment were transposed with pGPS1.1 using the GPS-1 Genome Priming System kit (New England Biolabs, Inc., Beverly, Mass.). A number of these transposed plasmids were sequenced from each end of the transposon to obtain kilobase long DNA fragments. Sequence assembly was performed with the Sequencher program (Gene Codes Corp., Ann Arbor Mich.).

Example 2 Isolation of Brevibacterium sp. HCU Monooxygenase Genes Involved In The Oxidation Of Cyclohexanone

[0202] This Example describes the isolation of the cyclohexanol and cyclohexanone degrader Brevibacterium sp. HCU. Discovery of BV monooxygenase genes from the organism was accomplished using differential display methods.

[0203] Strain Isolation

[0204] Selection for a halotolerant bacterium degrading cyclohexanol and cyclohexanone was performed on agar plates of a halophilic minimal medium (per liter: 15 g agar, 100 g NaCl, 10 g MgSO₄, 2 g KCl, 1 g NH₄Cl, 50 mg KH₂PO₄, 2 mg FeSO₄, 8 g Tris-HCl (pH 7)) containing traces of yeast extract and casamino acids (0.005% each) and incubated under vapors of cyclohexanone at 30° C. The inoculum was a resuspension of sludge from an industrial wastewater treatment plant. After two weeks, beige colonies were observed and streaked to purity on fresh agar plates grown under the same conditions.

[0205] The complete 16S rDNA sequence of the isolated Brevibacterium sp. HCU was found to be unique and is shown as SEQ ID NO:4. Comparison to other 16S rRNA sequences in the GenBank sequence database found the 16S rRNA gene sequence from strain HCU was at least 99% homologous to the 16S rRNA gene sequences of bacteria belonging to the genus Brevibacterium.

[0206] Induction of the Cyclohexanone Degradation Pathway

[0207] Inducibility of the cyclohexanone pathway was tested by respirometry in low salt medium. One colony of Brevibacterium sp. HCU was inoculated in 300 ml of S12 mineral medium (50 mM KHPO₄ buffer (pH 7.0), 10 mM (NH₄)₂SO₄, 2 mM MgCl₂, 0.7 mM CaCl₂, 50 μM MnCl₂, 1 μM FeCl₃, 1 μM ZnCl₂, 1.72 μM CuSO₄, 2.53 μM CoCl₂, 2.42 μM Na₂MoO₄, and 0.0001% FeSO₄) containing 0.005% yeast extract. The culture was then split into two flasks which received respectively 10 mM acetate and 10 mM cyclohexanone. Each flask was incubated for 6 hr at 30° C. to allow for the induction of the cyclohexanone degradation genes. The cultures were then chilled on ice, harvested by centrifugation and washed three times with ice-cold S12 medium lacking traces of yeast extract. Cells were finally resuspended to an optical density of 2.0 at 600 nm and kept on ice until assayed.

[0208] Half a ml of each culture was placed in a water jacketed respirometry cell equipped with an oxygen electrode (Yellow Springs Instruments Co., Yellow Springs, Ohio) and containing 5 ml of air saturated S12 medium at 30° C. After establishing the baseline respiration for each of the cell suspensions, acetate or cyclohexanone was added to a final concentration of 0.02% and the rate of O₂ consumption was further monitored.

[0209] Identification of Cyclohexanone Oxidation Genes

[0210] Identification of genes involved in the oxidation of cyclohexanone made use of the fact that this oxidation pathway is inducible. The mRNA populations of a control culture and a cyclohexanone-induced culture were compared using a technique based on the random amplification of DNA fragments by reverse transcription followed by PCR.

[0211] Isolation of Total Cellular RNA

[0212] The cyclohexanone oxidation pathway was induced by addition of 0.1% cyclohexanone into one of two “split” 10 ml cultures of Brevibacterium sp. HCU grown in S12 medium. Each culture was chilled rapidly in an ice-water bath and transferred to a 15 ml tube. Cells were collected by centrifugation for 2 min at 12,000×g in a rotor chilled to −4° C. The supernatants were discarded, the pellets resuspended in 0.7 ml of ice-cold solution of 1% SDS and 100 mM sodium acetate at pH 5 and transferred to a 2 ml tube containing 0.7 ml of aqueous phenol pH 5 and 0.3 ml of 0.5 mm zirconia beads (Biospec Products, Bartlesville, Okla.). The tubes were placed in a bead beater (Biospec) and disrupted at 2,400 beats per min for two min.

[0213] Following the disruption of the cells, the liquid phases of the tubes were transferred to new microfuge tubes and the phases separated by centrifugation for 3 min at 15,000×g. The aqueous phase containing total RNA was extracted twice more with phenol at pH 5 and twice with a mixture of phenol/chloroform/isoamyl alcohol pH 7.5 until a precipitate was no longer visible at the phenol/water interface. Nucleic acids were then recovered from the aqueous phase by ethanol precipitation with three volumes of ethanol and the pellet resuspended in 0.5 ml of diethyl pyrocarbonate (DEPC) treated water. DNA was digested by 6 units of RNase-free DNase (Boehringer Mannheim, Indianapolis, Ind.) for 1 hr at 37° C. The total RNA solution was then extracted twice with phenol/chloroform/isoamyl alcohol pH 7.5, recovered by ethanol precipitation and resuspended in 1 ml of DEPC treated water to an approximate concentration of 0.5 mg per ml.

[0214] Generation of RAPD Patterns From Arbitrarily Reverse-Transcribed Total RNA

[0215] Arbitrarily amplified DNA fragments were generated from the total RNA of control and induced cells by following the protocol described by Wong K. K. et al. (Proc Natl Acad Sci USA. 91:639 (1994)). A series of parallel reverse transcription (RT)/PCR amplification experiments were performed using a RT-PCR oligonucleotide set. This set consisted of 81 primers, each designed with the sequence CGGAGCAGATCGAVVVV (SEQ ID NO:63) where VVVV represent all the combinations of the three bases A, G and C at the last four positions of the 3′-end.
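The size of this primer set follows from the design: four variable positions, each drawn from three bases, give 3⁴ = 81 sequences. As an illustration only, the sketch below enumerates such a set from the fixed 5′ portion of SEQ ID NO:63; the function and variable names are hypothetical.

```python
from itertools import product

FIXED_5_PRIME = "CGGAGCAGATCGA"  # constant portion of SEQ ID NO:63


def arbitrary_primers(fixed: str = FIXED_5_PRIME, bases: str = "AGC", positions: int = 4) -> list[str]:
    """All primers formed by appending every combination of `bases` at the 3' end."""
    return [fixed + "".join(tail) for tail in product(bases, repeat=positions)]


primers = arbitrary_primers()
print(len(primers), primers[0], primers[-1])
```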

[0216] The series of parallel RT-PCR amplification experiments were performed on the total RNA from the control and induced cells, each using a single RT-PCR oligonucleotide. Briefly, 50 μl reverse transcription (RT) reactions were performed on 20-100 ng of total RNA using 100 U Moloney Murine Leukemia Virus (MMLV) reverse transcriptase (Promega, Madison, Wis.) with 0.5 mM of each dNTP and 1 mM for each oligonucleotide primer. Reactions were prepared on ice and incubated at 37° C. for 1 hr.

[0217] Five μl from each RT reaction were then used as template in a 50 μl PCR reaction containing the same primer used for the RT reaction (0.25 μM), dNTPs (0.2 mM each), magnesium acetate (4 mM) and 2.5 U of the Taq DNA polymerase Stoffel fragment (Perkin Elmer, Foster City, Calif.). The following temperature program was used: 94° C. (5 min), 40° C. (5 min), 72° C. (5 min) for 1 cycle followed by 40 cycles of 94° C. (1 min), 60° C. (1 min), 72° C. (5 min).

[0218] RAPD fragments were separated by electrophoresis on acrylamide gels (15 cm×15 cm×1.5 mm, 6% acrylamide, 29:1 acryl:bisacrylamide, 100 mM Tris, 90 mM borate, 1 mM EDTA pH 8.3). Five μl from each PCR reaction were analyzed with the reactions from the control and the induced RNA for each primer running side by side. Electrophoresis was performed at 1 V/cm. DNA fragments were visualized by silver staining using the Plus One® DNA silver staining kit in the Hoefer automated gel stainer (Amersham Pharmacia Biotech, Piscataway, N.J.).

[0219] Reamplification of the Differentially Expressed DNA

[0220] Stained gels were rinsed extensively for one hr with distilled water. Bands generated from the RNA of cyclohexanone induced cells but absent in the reaction from the RNA of control cells were excised from the gel and placed in a tube containing 50 μl of 10 mM KCl and 10 mM Tris-HCl (pH 8.3) and heated to 95° C. for 1 hr to allow some of the DNA to diffuse out of the gel. Serial dilutions of the eluate over a 200 fold range were used as template for a new PCR reaction using the Taq polymerase. The primer used for each reamplification (0.25 μM) was the one that had generated the pattern.

[0221] Each reamplified fragment was cloned into the blue/white cloning vector pCR2.1 (Invitrogen, San Diego, Calif.) and sequenced using the universal forward and reverse primers (M13 Reverse Primer (SEQ ID NO:64) and M13 (−20) Forward Primer (SEQ ID NO:65)).

[0222] Extension of Monooxygenase Fragments by Out-PCR.

[0223] Kilobase-long DNA fragments extending the sequence fragments identified by differential display were generated by “Out-PCR”, a PCR technique using an arbitrary primer in addition to a sequence specific primer. The first step of this PCR-based gene walking technique consisted of randomly copying the chromosomal DNA using a primer of arbitrary sequence in a single round of amplification under low stringency conditions. The primers used for Out-PCR were chosen from a primer set used for mRNA differential display and their sequences were CGGAGCAGATCGAVVVV (SEQ ID NO:63) where each V was A, G or C. Ten Out-PCR reactions were performed, each using one primer of arbitrary sequence. The reactions (50 μl) included a 1× concentration of the rTth XL buffer provided by the manufacturer (Perkin-Elmer, Foster City, Calif.), 1.2 mM magnesium acetate, 0.2 mM of each dNTP, 10-100 ng genomic DNA, 0.4 mM of one arbitrary primer and 1 unit of rTth XL polymerase (Perkin-Elmer). A five min annealing (45° C.) and 15 min extension cycle (72° C.) led to the copying of the genomic DNA at arbitrary sites and the incorporation of a primer of arbitrary but known sequence at the 3′ end.

[0224] After these initial low stringency annealing and replication steps, each reaction was split into two tubes. One tube received a specific primer (0.4 mM) designed against the end of the sequence to be extended and directed outward, while the second tube received water and was used as a control. Thirty additional PCR cycles were performed under higher stringency conditions with denaturation at 94° C. (1 min), annealing at 60° C. (0.5 min) and extension at 72° C. (10 min). The long extension time was designed to allow for the synthesis of long DNA fragments by the long range rTth XL DNA polymerase. The products of each pair of reactions were analyzed in adjacent lanes on an agarose gel.

[0225] Bands present in the sample having received the specific primer but not in the control sample were excised from the agarose gel, melted in 0.5 ml H₂O and used as the template in a new set of PCR reactions. A 1× concentration of rTth XL buffer, 1.2 mM magnesium acetate, 0.2 mM of each dNTP, 0.4 mM of primers, 1/1000 dilution of the melted slice and 1 unit of rTth XL polymerase were used for these reactions. The PCR was performed at 94° C. (1 min), 60° C. (0.5 min), and 72° C. (15 min) per cycle for 20 cycles. For each of these reamplification reactions, two control reactions, lacking either the arbitrary primer or the specific primer, were included in order to confirm that the reamplification of the band of interest required both the specific and arbitrary primer. DNA fragments that required both the specific and arbitrary primer for amplification were sequenced. For sequencing, the long fragments obtained by Out-PCR were partially digested with MboI and cloned into pCR2.1 (Invitrogen, Carlsbad, Calif.). Sequences for these partial fragments were obtained using primers designed against the vector sequence.

EXAMPLE 3 Isolation of an Acidovorax sp. CHX Monooxygenase Gene Involved in Degradation of Cyclohexane

[0226] This Example describes the isolation of the cyclohexane degrader Acidovorax sp. CHX. Discovery of a BVMO gene was accomplished using differential display methods.

[0227] Strain Isolation

[0228] An enrichment for bacteria growing on cyclohexane as a sole carbon source was started by adding 5 ml of an industrial wastewater sludge to 20 ml of mineral medium (50 mM KHPO₄ (pH 7.0), 10 mM (NH₄)₂SO₄, 2 mM MgCl₂, 0.7 mM CaCl₂, 50 μM MnCl₂, 1 μM FeCl₃, 1 μM ZnCl₂, 1.72 μM CuSO₄, 2.53 μM CoCl₂, 2.42 μM Na₂MoO₄, and 0.0001% FeSO₄) in a 125 ml Erlenmeyer flask sealed with a Teflon lined screw cap. A test tube containing 1 ml of a mixture of mineral oil and cyclohexane (8/1 v/v) was fitted in the flask to provide a low vapor pressure of cyclohexane (approximately 30% of the vapor pressure of pure cyclohexane). The enrichment was incubated at 30° C. for a week. Periodically, 1 to 10 dilutions of the enrichment were performed in the same mineral medium supplemented with 0.005% of yeast extract under low cyclohexane vapors. After several transfers, white flocks could be seen in the enrichments under cyclohexane vapors. If cyclohexane was omitted, the flocks did not grow.

[0229] After several transfers, the flocks could be grown with 4 μl of liquid cyclohexanone added directly to 10 ml of medium. To isolate colonies, flocks were washed in medium and disrupted by thorough shaking in a bead beater. The cells released from the disrupted flocks were streaked onto R2A medium agar plates and incubated under cyclohexane vapors. Pinpoint colonies were picked under a dissecting microscope and inoculated in 10 ml of mineral medium supplemented with 0.01% yeast extract and 4 μl of cyclohexane. The flocks were grown, disrupted and streaked again until a pure culture was obtained.

[0230] Taxonomic identification of this isolate was performed by PCR amplification of 16S rDNA, as described in the General Methods. The 16S rRNA gene sequence from strain CHX was at least 98% homologous to the 16S rRNA gene sequence of an uncultured bacterium (Seq. Accession number AF143840) and 95% homologous to the 16S rRNA gene sequence of Acidovorax temperans (Accession number AF078766). The complete 16S rDNA sequence of the isolated Acidovorax sp. CHX is shown as SEQ ID NO:5.

[0231] Induction of Cyclohexane Degradation Genes

[0232] For induction of cyclohexane degradation genes, colonies of Acidovorax sp. CHX were scraped from an R2A agar plate and inoculated into 25 ml R2A broth. This culture was incubated overnight at 30° C. The next day 25 ml of fresh R2A broth was added and growth was continued for 15 min. The culture was split into two separate flasks, each of which received 25 ml. To one of these flasks, 5 μl of pure cyclohexane was added to induce expression of cyclohexane degradation genes. The other flask was kept as a control. Differential display was used to identify the Acidovorax sp. CHX monooxygenase gene. Identification of cyclohexane induced gene sequences and sequencing of cyclohexanone oxidation genes from the strain were performed in a manner similar to that described in Example 1.

EXAMPLE 4 Isolation of an Acinetobacter sp. SE19 Monooxygenase Gene Involved in Degradation of Cyclohexanol

[0233] This Example describes the isolation of the cyclohexanol degrader Acinetobacter sp. SE19. Discovery of a BV monooxygenase gene was accomplished by screening of cosmid libraries, followed by sequencing of shot-gun libraries.

[0234] Isolation of Strain

[0235] An enrichment for bacteria that grow on cyclohexanol was isolated from a cyclopentanol enrichment culture. The enrichment culture was established by inoculating 1 mL of activated sludge into 20 mL of S12 medium (10 mM ammonium sulfate, 50 mM potassium phosphate buffer (pH 7.0), 2 mM MgCl₂, 0.7 mM CaCl₂, 50 μM MnCl₂, 1 μM FeCl₃, 1 μM ZnCl₂, 1.72 μM CuSO₄, 2.53 μM CoCl₂, 2.42 μM Na₂MoO₄, and 0.0001% FeSO₄) in a sealed 125 mL screw-cap Erlenmeyer flask. The enrichment culture was supplemented with 100 ppm cyclopentanol added directly to the culture medium and was incubated at 35° C. with reciprocal shaking. The enrichment culture was maintained by adding 100 ppm cyclopentanol every 2-3 days. The culture was diluted every 2-10 days by replacing 10 mL of the culture with the same volume of S12 medium. After 15 days of incubation, serial dilutions of the enrichment culture were spread onto LB plates. Single colonies were screened for the ability to grow on S12 liquid with cyclohexanol as the sole carbon and energy source. The cultures were grown at 35° C. in sealed tubes. One of the isolates, strain SE19, was selected for further characterization.

[0236] The 16S rRNA genes of SE19 isolates were amplified by PCR according to the procedures of the General Methods. Results from all isolates showed that strain SE19 has close homology to Acinetobacter haemolyticus and Acinetobacter junii (99% nucleotide identity to each).

[0237] Construction of Acinetobacter Cosmid Libraries

[0238] Acinetobacter sp. SE19 was grown in 25 ml LB medium for 6 h at 37° C. with aeration. Bacterial cells were centrifuged at 6,000 rpm for 10 min in a Sorvall RC5C centrifuge at 4° C. Supernatant was decanted and the cell pellet was frozen at −80° C. Chromosomal DNA was prepared as outlined below with special care taken to avoid shearing of DNA. The cell pellet was gently resuspended in 5 ml of 50 mM Tris-10 mM EDTA (pH 8) and lysozyme was added to a final concentration of 2 mg/ml. The suspension was incubated at 37° C. for 1 h. Sodium dodecyl sulfate was then added to a final concentration of 1% and proteinase K was added at 100 μg/ml. The suspension was incubated at 55° C. for 2 h. The suspension became clear and the clear lysate was extracted with an equal volume of phenol:chloroform:isoamyl alcohol (25:24:1). After centrifuging at 12,000 rpm for 20 min, the aqueous phase was carefully removed and transferred to a new tube. Two volumes of ethanol were added and the DNA was gently spooled with a sealed glass pasteur pipet. The DNA was dipped into a tube containing 70% ethanol. After air drying, the DNA was resuspended in 400 μl of TE (10 mM Tris-1 mM EDTA, pH 8) with RNaseA (100 μg/ml) and stored at 4° C. The concentration and purity of DNA was determined spectrophotometrically by OD₂₆₀/OD₂₈₀. A diluted aliquot of DNA was run on a 0.5% agarose gel to determine the intact nature of DNA.
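Spectrophotometric estimates of DNA concentration and purity of the kind mentioned above are conventionally based on the approximation that one absorbance unit at 260 nm corresponds to about 50 μg/ml of double-stranded DNA in a 1 cm path, with an OD₂₆₀/OD₂₈₀ ratio near 1.8 commonly taken to indicate DNA largely free of protein. The following sketch applies those conventions; the function names and the absorbance readings are hypothetical and are not values from this Example.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # conventional conversion factor for double-stranded DNA


def dsdna_concentration(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimated dsDNA concentration (ug/ml) of the undiluted stock."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260: float, a280: float) -> float:
    """A260/A280 ratio used as a rough indicator of protein contamination."""
    return a260 / a280


# Hypothetical readings on a 1:20 dilution of the stock.
stock_ug_per_ml = dsdna_concentration(0.25, dilution_factor=20)
ratio = purity_ratio(0.25, 0.14)
print(f"{stock_ug_per_ml:.0f} ug/ml, A260/A280 = {ratio:.2f}")
```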

[0239] Chromosomal DNA was partially digested with Sau3AI (GIBCO/BRL, Gaithersburg, Md.) as outlined by the instruction manual for the SuperCos 1 Cosmid Vector Kit. DNA (10 μg) was digested with 0.5 unit of Sau3AI at room temperature in 100 μl of reaction volume. Aliquots of 20 μl were withdrawn at various time points of the digestion: e.g., 0, 3, 6, 9, 12 min. DNA loading buffer was added and samples were analyzed on a 0.5% agarose gel to determine the extent of digestion. A decrease in size of chromosomal DNA corresponded to an increase in the length of time for Sau3AI digestion. The preparative reaction was performed using 50 μg of DNA digested with 1 unit of Sau3AI for 3 min at room temperature. The digestion was terminated by addition of 8 mM of EDTA. The DNA was extracted once with phenol:chloroform:isoamyl alcohol and once with chloroform. The aqueous phase was adjusted to 0.3 M NaOAc and ethanol precipitated. The partially digested DNA was dephosphorylated with calf intestinal alkaline phosphatase and ligated to SuperCos 1 vector, which had been treated according to the instructions in the SuperCos 1 Cosmid Vector Kit. The ligated DNA was packaged into lambda phage using Gigapack III XL packaging extract, as recommended by Stratagene (manufacturer's instructions were followed). The packaged Acinetobacter genomic DNA library contained a phage titer of 5.6×10⁴ colony forming units per μg of DNA as determined by transfecting E. coli XL1-Blue MR. Cosmid DNA was isolated from six randomly chosen E. coli transformants and found to contain large inserts of DNA (25-40 kb).
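The number of cosmid clones that must be screened to have a given chance of finding any particular gene is often estimated with the Clarke-Carbon relation, N = ln(1 − P) / ln(1 − I/G), where I is the insert size and G the genome size. The sketch below is illustrative only: the genome size of 3.6 Mb is an assumed figure, not one reported for strain SE19, and the 30 kb insert size is simply a value inside the 25-40 kb range stated above.

```python
import math


def clones_needed(genome_bp: float, insert_bp: float, probability: float) -> int:
    """Clarke-Carbon estimate of clones required to include a given locus with `probability`."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    fraction = insert_bp / genome_bp  # fraction of the genome carried by one clone
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


ASSUMED_GENOME_BP = 3.6e6   # assumption for illustration only
ASSUMED_INSERT_BP = 30_000  # within the 25-40 kb range reported for the library

n_99 = clones_needed(ASSUMED_GENOME_BP, ASSUMED_INSERT_BP, 0.99)
print(n_99, "clones for a 99% chance of including a given locus")
```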

[0240] Identification and Characterization of Cosmid Clones Containing a Cyclohexanone Monooxygenase Gene

[0241] The cosmid library of Acinetobacter sp. SE19 was screened based on the homology of the cyclohexanone monooxygenase gene. Two primers, monoL: GAGTCTGAGCATATGTCACAAAAAATGGATTTTG (SEQ ID NO:66) and monoR: GAGTCTGAGGGATCCTTAGGCATTGGCAGGTTGCTTGAT (SEQ ID NO:67) were designed based on the published sequence of the cyclohexanone monooxygenase gene of Acinetobacter sp. NCIB 9871. The cosmid library was screened by PCR using monoL and monoR primers. Five positive clones (5B12, 5F5, 8F6, 14B3 and 14D7) were identified among about 1000 clones screened. They all contain inserts of 35-40 kb that show homology to the cyclohexanone monooxygenase gene amplified by monoL and monoR primers. Southern hybridization using this gene fragment as a probe indicated that the cosmid clone 5B12 has about 20 kb region upstream of the monooxygenase gene and cosmid clone 8F6 has about 30 kb downstream of the monooxygenase gene. Cosmid clone 14B3 contains rearranged Acinetobacter DNA adjacent to the monooxygenase gene.

[0242] Construction of Shot-gun Sequencing Libraries

[0243] Shot gun libraries of 5B12 and 8F6 were constructed. Cosmid DNA was sheared in a nebulizer (Inhalation Plastics Inc., Chicago, Ill.) at 20 psi for 45 sec and the 1-3 kb portion was gel purified. Purified DNA was treated with T4 DNA polymerase and T4 polynucleotide kinase following manufacturer's (GIBCO/BRL) instructions. Polished inserts were ligated into pUC18 vectors using Ready-To-Go pUC18 SmaI/BAP + Ligase (GIBCO/BRL). The ligated DNA was transformed into E. coli DH5α cells and plated on LB with ampicillin and X-gal. A majority of the transformants were white and those containing inserts were sequenced with the universal and reverse primers of pUC18 by standard sequencing methods.

[0244] Shot gun library inserts were sequenced with pUC18 universal and reverse primers. Sequences of 200-300 clones from each library were assembled using Sequencher 3.0 program. A contig of 17419 bp containing the cyclohexanone monooxygenase gene was formed.
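The depth of sequence coverage implied by a shot gun library of this size can be estimated with the Lander-Waterman approximation, in which coverage c = (number of reads × read length) / target length and the expected fraction of the target sequenced at least once is 1 − e^(−c). The sketch below is illustrative only: the read length of 500 bp and the cosmid insert size of 40 kb are assumptions, and 250 clones is simply the midpoint of the 200-300 range stated above, with two reads (universal and reverse) per clone.

```python
import math


def expected_coverage(n_reads: int, read_length_bp: float, target_bp: float) -> float:
    """Average number of times each base of the target is sequenced."""
    return n_reads * read_length_bp / target_bp


def fraction_covered(coverage: float) -> float:
    """Lander-Waterman estimate of the fraction of the target hit by at least one read."""
    return 1.0 - math.exp(-coverage)


ASSUMED_READ_LENGTH_BP = 500    # assumption; not stated in the text
ASSUMED_COSMID_INSERT_BP = 40_000

reads = 250 * 2  # two end reads per clone
coverage = expected_coverage(reads, ASSUMED_READ_LENGTH_BP, ASSUMED_COSMID_INSERT_BP)
print(f"{coverage:.2f}x coverage, {fraction_covered(coverage):.3%} of insert expected covered")
```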

Example 5

[0245] Isolation and Sequencing of Rhodococcus erythropolis AN12

[0246] This Example describes isolation of Rhodococcus erythropolis AN12 strain from wastestream sludge. A shotgun sequencing strategy permitted sequencing of the entire microbial genome.

[0247] Isolation of Rhodococcus erythropolis AN12

[0248] Strain AN12 of Rhodococcus erythropolis was isolated on the basis of ability to grow on aniline as the sole source of carbon and energy. Bacteria that grow on aniline were isolated from an enrichment culture. The enrichment culture was established by inoculating 1 ml of activated sludge into 10 ml of S12 medium (10 mM ammonium sulfate, 50 mM potassium phosphate buffer (pH 7.0), 2 mM MgCl₂, 0.7 mM CaCl₂, 50 μM MnCl₂, 1 μM FeCl₃, 1 μM ZnCl₂, 1.72 μM CuSO₄, 2.53 μM CoCl₂, 2.42 μM Na₂MoO₄, and 0.0001% FeSO₄) in a 125 ml screw cap Erlenmeyer flask. The activated sludge was obtained from a DuPont wastewater treatment facility. The enrichment culture was supplemented with 100 ppm aniline added directly to the culture medium and was incubated at 25° C. with reciprocal shaking. The enrichment culture was maintained by adding 100 ppm of aniline every 2-3 days. The culture was diluted every 14 days by replacing 9.9 ml of the culture with the same volume of S12 medium. Bacteria that utilize aniline as a sole source of carbon and energy were isolated by spreading samples of the enrichment culture onto S12 agar. Aniline was placed on the interior of each petri dish lid. The petri dishes were sealed with parafilm and incubated upside down at room temperature (25° C.). Representative bacterial colonies were then tested for the ability to use aniline as a sole source of carbon and energy. Colonies were transferred from the original S12 agar plates used for initial isolation to new S12 agar plates and supplied with aniline on the interior of each petri dish lid. The petri dishes were sealed with parafilm and incubated upside down at room temperature (25° C.).

[0249] A 16S rRNA gene of strain AN12 was sequenced (SEQ ID NO:6) as described in the General Methods and compared to other 16S rRNA sequences in the GenBank sequence database. The 16S rRNA gene sequence from strain AN12 was at least 98% homologous to the 16S rRNA gene sequences of high G+C Gram positive bacteria belonging to the genus Rhodococcus.

[0250] Preparation of Genomic DNA for Sequencing and Sequence Generation

[0251] Genomic DNA was prepared and the library was constructed according to published protocols (Fraser et al. Science 270(5235): 397-403 (1995)). A cell pellet was resuspended in a solution containing 100 mM Na-EDTA (pH 8.0), 10 mM Tris-HCl (pH 8.0), 400 mM NaCl, and 50 mM MgCl₂.

[0252] Genomic DNA Preparation

[0253] After resuspension, the cells were gently lysed in 10% SDS and incubated for 30 minutes at 55° C. After incubation at room temperature, proteinase K (Boehringer Mannheim, Indianapolis, Ind.) was added to 100 μg/ml and the mixture was incubated at 37° C. until the suspension was clear. DNA was extracted twice with Tris-equilibrated phenol and twice with chloroform. DNA was precipitated in 70% ethanol and resuspended in a solution containing 10 mM Tris-HCl and 1 mM Na-EDTA (TE buffer) pH 7.5. The DNA solution was treated with a mix of RNases, then extracted twice with Tris-equilibrated phenol and twice with chloroform. This was followed by precipitation in ethanol and resuspension in TE buffer.

[0254] Library Construction

[0255] 200 to 500 μg of chromosomal DNA was resuspended in a solution of 300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, and 30% glycerol, and sheared at 12 psi for 60 sec in an Aeromist Downdraft Nebulizer chamber (IBI Medical products, Chicago, Ill.). The DNA was precipitated, resuspended and treated with Bal31 nuclease (New England Biolabs, Beverly, Mass.). After size fractionation, a fraction (2.0 kb, or 5.0 kb) was excised, cleaned and a two-step ligation procedure was used to produce a high titer library with greater than 99% single inserts.

[0256] Sequencing

[0257] A shotgun sequencing strategy was adopted for the sequencing of the whole microbial genome (Fleischmann, R. et al. Whole-Genome Random sequencing and assembly of Haemophilus influenzae Rd. Science 269(5223): 496-512 (1995)).

Example 6 Identification and Characterization of Bacterial Genes

[0258] Genes encoding each monooxygenase were identified by conducting BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al., (1993) J. Mol. Biol. 215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to sequences contained in the BLAST “nr” database (comprising all non-redundant GenBank CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein Data Bank, the SWISS-PROT protein sequence database, EMBL, and DDBJ databases). The sequences obtained in Examples 1, 2, 3, 4, and 5 were analyzed for similarity to all publicly available DNA sequences contained in the “nr” database using the BLASTN algorithm provided by the National Center for Biotechnology Information (NCBI). The DNA sequences were translated in all reading frames and compared for similarity to all publicly available protein sequences contained in the “nr” database using the BLASTX algorithm with the BLOSUM62 matrix, a gap existence cost of 11, a per-residue gap cost of 2, filtering, and gapped alignment (Gish, W. and States, D. J. Nature Genetics 3:266-272 (1993)) provided by the NCBI.

[0259] All comparisons were done using either the BLASTNnr or BLASTXnr algorithm. The results of the BLAST comparisons are given in Table 3, which summarizes the sequence to which each sequence has the most similarity. Table 3 displays data based on the BLASTXnr algorithm with values reported in expect values. The Expect value estimates the statistical significance of the match, specifying the number of matches, with a given score, that are expected in a search of a database of this size absolutely by chance.

TABLE 3
ORF Name; Gene Name and Organism of Isolation; Similarity Identified; SEQ ID base; SEQ ID Peptide; % Identity^(a); % Similarity^(b); E-value^(c); Citation
1; chnB, Rhodococcus sp. phi1; >gb|AAG10021.1|AF282240_5 (AF282240) cyclohexanone monooxygenase [Acinetobacter sp. SE19]; 7; 8; 55; 71; e−174; Cheng, Q., et al. J. Bacteriol. 182: 4744-4751 (2000)
2; chnB, Rhodococcus sp. phi2; >gb|AAG10021.1|AF282240_5 (AF282240) cyclohexanone monooxygenase [Acinetobacter sp. SE19]; 9; 10; 53; 67; e−163; Cheng, Q., et al. J. Bacteriol. 182: 4744-4751 (2000)
3; chnB, Arthrobacter sp. BP2; >gb|AAG10021.1|AF282240_5 (AF282240) cyclohexanone monooxygenase [Acinetobacter sp. SE19]; 11; 12; 57; 72; e−106; Cheng, Q., et al. J. Bacteriol. 182: 4744-4751 (2000)
4; chnB1, Brevibacterium sp. HCU; >pir||JC7158 steroid monooxygenase (EC 1.14.99.-) - Rhodococcus rhodochrous dbj|BAA24454.1| (AB010439) steroid monooxygenase [Rhodococcus rhodochrous]; 13; 14; 44; 59; e−122; Morii, S., et al. J. Biochem. 126 (3): 624-631 (1999)
5; chnB2, Brevibacterium sp. HCU; >pir||JC7158 steroid monooxygenase (EC 1.14.99.-) - Rhodococcus rhodochrous dbj|BAA24454.1| (AB010439) steroid monooxygenase [Rhodococcus rhodochrous]; 15; 16; 38; 53; 2e−94; Morii, S., et al. J. Biochem. 126 (3): 624-631 (1999)
6; chnB, Acidovorax sp. CHX; >gb|AAG10021.1|AF282240_5 (AF282240) cyclohexanone monooxygenase [Acinetobacter sp. SE19]; 17; 18; 57; 73; 0.0; Cheng, Q., et al. J. Bacteriol. 182: 4744-4751 (2000)
7; chnB, Acinetobacter sp. SE19; >dbj|BAA86293.1| (AB006902) cyclohexanone 1,2-monooxygenase [Acinetobacter sp.] dbj|BAB61738.1| (AB026668) cyclohexanone 1,2-monooxygenase [Acinetobacter sp. NCIMB9871]; 19; 20; 99; 99; 0.0; Chen, Y. C., et al. J. Bacteriol. 170 (2): 781-789 (1988)
8; ORF 8 chnB, Rhodococcus erythropolis AN12; >pir||T37052 probable flavin-containing monooxygenase - Streptomyces coelicolor emb|CAB52349.1| (AL109747) putative flavin-containing monooxygenase [Streptomyces coelicolor A3(2)]; 21; 22; 37; 50; 6e−58; Seeger, K. J., et al. Direct Submission (??-AUG-1999) to the EMBL Data Library
9; ORF 9 chnB, Rhodococcus erythropolis AN12; >emb|CAB59668.1| (AL132674) monooxygenase. [Streptomyces coelicolor A3(2)]; 23; 24; 44; 61; e−118; Redenbach, M., et al. Mol. Microbiol. 21 (1): 77-96 (1996)
10; ORF 10 chnB, Rhodococcus erythropolis AN12; >pir||JC7158 steroid monooxygenase (EC 1.14.99.-) - Rhodococcus rhodochrous dbj|BAA24454.1| (AB010439) steroid monooxygenase [Rhodococcus rhodochrous]; 25; 26; 64; 76; 0.0; Morii, S., et al. J. Biochem. 126 (3): 624-631 (1999)
11; ORF 11 chnB, Rhodococcus erythropolis AN12; >gb|AAK22759.1| (AE005753) monooxygenase, flavin-binding family [Caulobacter crescentus]; 27; 28; 65; 74; e−176; Nierman, W. C., et al. Proc. Natl. Acad. Sci. U.S.A. 98 (7): 4136-4141 (2001)
12; ORF 12 chnB, Rhodococcus erythropolis AN12; >emb|CAB59668.1| (AL132674) monooxygenase. [Streptomyces coelicolor A3(2)]; 29; 30; 45; 63; e−124; Redenbach, M., et al. Mol. Microbiol. 21 (1): 77-96 (1996)
13; ORF 13 chnB, Rhodococcus erythropolis AN12; >gb|AAK24539.1| (AE005925) monooxygenase, flavin-binding family [Caulobacter crescentus]; 31; 32; 55; 68; e−159; Nierman, W. C., et al. Proc. Natl. Acad. Sci. U.S.A. 98 (7): 4136-4141 (2001)
14; ORF 14 chnB, Rhodococcus erythropolis AN12; >pir||JC7158 steroid monooxygenase (EC 1.14.99.-) - Rhodococcus rhodochrous dbj|BAA24454.1| (AB010439) steroid monooxygenase [Rhodococcus rhodochrous]; 33; 34; 51; 65; e−154; Morii, S., et al. J. Biochem. 126 (3): 624-631 (1999)
15; ORF 15 chnB, Rhodococcus erythropolis AN12; >sp|P55487|Y4ID_RHISN PROBABLE MONOOXYGENASE Y4ID gb|AAB91699.1| (AE000078) Y4iD [Rhizobium sp. NGR234]; 35; 36; 39; 58; e145; Freiberg, C. A., et al. Nature 387: 394-401 (1997)
16; ORF 16 chnB, Rhodococcus erythropolis AN12; >pir||A83453 probable flavin-containing monooxygenase PA1538 [imported] - Pseudomonas aeruginosa (strain PAO1) gb|AAG04927.1|AE004582_5 (AE004582) probable flavin-containing monooxygenase [Pseudomonas aeruginosa]; 37; 38; 43; 59; e−119; Stover, C. K., et al. Nature 406 (6799): 959-964 (2000)
17; ORF 17 chnB, Rhodococcus erythropolis AN12; >pir||G70852 hypothetical protein Rv3083 - Mycobacterium tuberculosis (strain H37RV) emb|CAA16141.1| (AL021309) hypothetical protein Rv3083 [Mycobacterium tuberculosis] gb|AAK47504.1| (AE007134) monooxygenase, flavin-binding family [Mycobacterium tuberculosis CDC1551]; 39; 40; 53; 70; e−150; Cole, S. T., et al. Nature 393 (6685): 537-544 (1998)
18; ORF 18 chnB, Rhodococcus erythropolis AN12; >pir||A83453 probable flavin-containing monooxygenase PA1538 [imported] - Pseudomonas aeruginosa (strain PAO1) gb|AAG04927.1|AE004582_5 (AE004582) probable flavin-containing monooxygenase [Pseudomonas aeruginosa]; 41; 42; 44; 60; e−117; Stover, C. K., et al. Nature 406 (6799): 959-964 (2000)
19; ORF 19 chnB, Rhodococcus erythropolis AN12; >gb|AAG10021.1|AF282240_5 (AF282240) cyclohexanone monooxygenase [Acinetobacter sp. SE19]; 43; 44; 54; 69; e−168; Cheng, Q., et al. J. Bacteriol. 182 (17): 4744-4751 (2000)
20; ORF 20 chnB, Rhodococcus erythropolis AN12; >pir||JC7158 steroid monooxygenase (EC 1.14.99.-) - Rhodococcus rhodochrous dbj|BAA24454.1| (AB010439) steroid monooxygenase [Rhodococcus rhodochrous]; 45; 46; 42; 60; e−123; Morii, S., et al. J. Biochem. 126 (3): 624-631 (1999)
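As a rough illustration of how an Expect value relates to an alignment score, the sketch below uses the standard Karlin-Altschul relation E = m·n·2^(−S′), where S′ is the normalized bit score and m and n are the effective query and database lengths. This relation is not stated in the text above, and the lengths used in the example are hypothetical.

```python
from math import log2


def expect_value(bit_score, query_length, database_length):
    """Expected number of chance hits scoring at least `bit_score`.

    Assumes E = m * n * 2**(-S'), with S' the bit score and m, n the
    effective query and database lengths.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


def bit_score_for_expect(expect, query_length, database_length):
    """Bit score needed to reach a given Expect value (inverse of the above)."""
    return log2(query_length * database_length / expect)


# Hypothetical search: a 540-residue query against 1e8 database residues.
needed = bit_score_for_expect(1e-100, 540, 1e8)
print(f"bit score needed for E = 1e-100: {needed:.1f}")
```

Each additional bit of score halves the number of matches expected by chance, which is why the strongest hits in Table 3 are reported as very small exponents or as 0.0.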

Example 7 Cloning and Expression of Monooxygenase Genes into Escherichia coli

[0260] This example illustrates the expression in E. coli of isolated full length BVMO genes from Brevibacterium sp. HCU, Acinetobacter SE19, Rhodococcus sp. phi1, Rhodococcus sp. phi2, Arthrobacter sp. BP2 and Acidovorax sp. CHX.

[0261] Full length BVMOs were PCR amplified, using chromosomal DNA as the template and the primers shown below in Table 4.

TABLE 4
Primers Used for Amplification of Full-Length BV Monooxygenases
Monooxygenase                  Forward Primer                             Reverse Primer
Brevibacterium sp. HCU chnB1   atgccaattacacaacaacttgacc (SEQ ID NO:68)   ctatttcatacccgccgattcac (SEQ ID NO:69)
Brevibacterium sp. HCU chnB2   atgacgtcaaccatgcctgcac (SEQ ID NO:70)      cacttaagtcgcattcagccc (SEQ ID NO:71)
Acinetobacter sp. SE19 chnB    atggattttgatgctatcgtg (SEQ ID NO:72)       ggcattggcaggttgcttg (SEQ ID NO:73)
Arthrobacter sp. BP2 chnB      atgactgcacagaacactttcc (SEQ ID NO:74)      tcaaagccgcggtatccg (SEQ ID NO:75)
Rhodococcus sp. phi1 chnB      atgactgcacagatctcacccac (SEQ ID NO:76)     tcaggcggtcaccgggacagcg (SEQ ID NO:77)
Rhodococcus sp. phi2 chnB      atgaccgcacagaccatccacac (SEQ ID NO:78)     tcagaccgtgaccatctcgg (SEQ ID NO:79)
Acidovorax sp. CHX chnB        atgtcttcctcgccaagcagc (SEQ ID NO:80)       cagtggttggaacgcaaagcc (SEQ ID NO:81)
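One way to sanity-check primer pairs like those in Table 4 is to confirm that the forward primer begins at a start codon and to see whether the reverse primer, read on the coding strand, ends in a stop codon. The sketch below is only an illustration; the helper names are hypothetical, and the interpretation in the closing note is an inference rather than a statement from the source.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")
STOP_CODONS = {"taa", "tag", "tga"}


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA primer."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def describe_primer_pair(forward, reverse):
    """Report start-codon and stop-codon features of a primer pair."""
    coding_end = reverse_complement(reverse)  # reverse primer on the coding strand
    return {
        "starts_with_atg": forward.lower().startswith("atg"),
        "ends_with_stop": coding_end[-3:] in STOP_CODONS,
    }


# Primer pairs copied from Table 4.
arthrobacter = describe_primer_pair("atgactgcacagaacactttcc", "tcaaagccgcggtatccg")
acinetobacter = describe_primer_pair("atggattttgatgctatcgtg", "ggcattggcaggttgcttg")
print(arthrobacter, acinetobacter)
```

The Arthrobacter reverse primer ends the amplicon with a stop codon, whereas the Acinetobacter reverse primer does not; a missing stop codon is what one would expect when a vector-encoded C-terminal tail must be read through.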

[0262] Following amplification, the chnB gene fragments were cloned into pTrcHis-TOPO TA vectors with either an N-terminal tail or C-terminal tail, as provided by the vector sequence (N-terminal tail for Brevibacterium sp. HCU, Rhodococcus sp. phi1, Rhodococcus sp. phi2, and Arthrobacter sp. BP2 monooxygenases; C-terminal tail for Acinetobacter sp. SE19 and Acidovorax sp. CHX monooxygenases). These vectors were transformed into E. coli, with transformants grown in Luria-Bertani broth supplemented with ampicillin (100 μg/ml) and riboflavin (0.1 μg/ml) at 30° C. until the absorbance at 600 nm (A600) reached 0.5. Once this A600 was reached, the temperature was shifted to 16° C.

[0263] The encoded monooxygenase sequences were expressed upon addition of IPTG to the culture media, 30 min after the temperature shift to 16° C. The cultures were grown further overnight (14 hrs) and harvested by centrifugation in a cold centrifuge. The cells were treated with lysozyme (100 mg/ml) for 30 min on ice and sonicated. Following sonication, cell extracts were centrifuged and the supernatant was equilibrated with Ni-NTA resin (Qiagen, Valencia, Calif.) for 1 hr at 4° C. Protein-bound resin was washed successively with increasing concentrations of imidazole buffer until the protein of interest was released from the resin. The purified protein was concentrated and the buffer exchanged to remove the imidazole. The protein concentration was adjusted to 1 μg/ml.

Example 8 Assays of chnB Monooxygenase Activities of Brevibacterium sp. HCU, Acinetobacter SE19, Rhodococcus sp. phi1, Rhodococcus sp. phi2, Arthrobacter sp. BP2 and Acidovorax sp. CHX.

[0264] The chnB monooxygenase activity of each over-expressed enzyme from Example 7 was assayed against various ketone substrates: cyclobutanone, cyclopentanone, 2-methylcyclopentanone, cyclohexanone, 2-methylcyclohexanone, cyclohex-2-ene-1-one, 1,2-cyclohexanedione, 1,3-cyclohexanedione, 1,4-cyclohexanedione, cycloheptanone, cyclooctanone, cyclodecanone, cycloundecanone, cyclododecanone, cyclotridecanone, cyclopentadecanone, 2-tridecanone, 2-phenylcyclohexanone, dihexyl ketone, norcamphor, beta-ionone, oxindole, levoglucosenone, dimethyl sulfoxide, dimethyl-2-piperidone, and phenylboronic acid. Compounds were selected on the basis of previous observations by van der Werf (Biochem. J. 347:693-701 (2000)) and Miyamoto et al. (Biochimica et Biophysica Acta 1251: 115-124 (1995)) and by searches for the ketone substructure.

[0265] All compounds were obtained from Sigma-Aldrich with only two exceptions. Levoglucosenone was obtained from Toronto Research Chemicals, Inc. and dimethyl-2-piperidone was prepared according to U.S. Pat. No. 6,077,955. For enzyme assays all compounds were dissolved to a concentration of 0.1 M in methanol, with the exceptions of norcamphor (dissolved in ethyl acetate), cyclododecanone, cyclotridecanone and cyclopentadecanone (dissolved in propanol), and levoglucosenone (dissolved in acetone).

[0266] The monooxygenase activity of each over-expressed enzyme was assayed spectrophotometrically at 340 nm by monitoring the oxidation of NADPH. Assays were performed in individual quartz cuvettes, with a pathlength of 1 cm. The following components were added to the cuvette for the enzyme assays: 380 μl of 33.3 mM MES-HEPES-sodium acetate buffer (pH 7.5), 5 μl of 0.1 M substrate (1.25 mM final concentration), 10 μl of 1 μg/ml enzyme solution (10 ng total, 0.025 ng/μl) and 5 μl NADPH (1.2 M, 15 mM final concentration). An Ultrospec 4000 (Pharmacia Biotech, Cambridge, England) was used to read the absorbance of the samples over a two to ten minute time period and the SWIFT (Pharmacia Biotech) program was used to calculate the slope of the reduction in absorbance over time. For the Brevibacterium sp. HCU chnB2, the rates were multiplied by a factor of 3.25 to adjust for decrease in activity due to storage as suggested by the literature (J. Bacteriol. 2000. 182: p. 4241-4248). Monooxygenase activity of each over-expressed enzyme is shown in Table 5, with respect to each ketone substrate. The specific activity values listed are given in μmol/min/mg. The notation “ND” refers to “No Activity Detected”.
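The cuvette arithmetic and the conversion from an absorbance slope to a specific activity can be sketched as follows. The volumes and stock concentrations are those given above (380 + 5 + 10 + 5 = 400 μl total, 10 ng enzyme); the NADPH extinction coefficient of 6.22 mM⁻¹ cm⁻¹ at 340 nm is an assumed textbook value that the source does not state, and the slope in the example is hypothetical.

```python
NADPH_EXTINCTION = 6.22  # mM^-1 cm^-1 at 340 nm; assumed, not given in the source

TOTAL_VOLUME_UL = 380 + 5 + 10 + 5  # buffer + substrate + enzyme + NADPH


def final_concentration(stock, added_ul, total_ul=TOTAL_VOLUME_UL):
    """Dilute a stock (any concentration unit) into the cuvette volume."""
    return stock * added_ul / total_ul


def specific_activity(slope_abs_per_min, enzyme_mg, volume_ml, path_cm=1.0):
    """Convert an A340 slope into umol NADPH oxidized per min per mg enzyme."""
    rate_mM_per_min = abs(slope_abs_per_min) / (NADPH_EXTINCTION * path_cm)
    umol_per_min = rate_mM_per_min * volume_ml  # mM x mL = umol
    return umol_per_min / enzyme_mg


substrate_mM = final_concentration(100.0, 5)   # 0.1 M stock, 5 ul
nadph_mM = final_concentration(1200.0, 5)      # 1.2 M stock, 5 ul
enzyme_ng_per_ul = 10.0 / TOTAL_VOLUME_UL      # 10 ng in 400 ul

# Hypothetical slope of -0.0622 A340/min with 10 ng (1e-5 mg) enzyme.
activity = specific_activity(-0.0622, 1e-5, TOTAL_VOLUME_UL / 1000)
print(substrate_mM, nadph_mM, enzyme_ng_per_ul, activity)
```

The first three values reproduce the final concentrations quoted in the paragraph; the last only shows the unit conversion and is not intended to match any entry in Table 5.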

[0267] Graphical representation of the data shown in Table 5 is also provided in FIGS. 1, 2, 3, 4, and 5.

TABLE 5
Specific Activity of Monooxygenase Enzymes Against Various Ketone Substrates
Compound                sp. HCU   sp. HCU   sp. SE19  sp. BP2   sp. CHX   sp. phi1  sp. phi2
                        chnB1     chnB2     chnB      chnB      chnB      chnB      chnB
Norcamphor              0.410     1.331     4.474     2.842     0.166     1.504     2.816
Cyclobutanone           ND        0.374     0.109     0.128     ND        0.102     0.154
Cyclopentanone          ND        1.331     3.034     1.491     0.621     1.370     2.451
2-methylcyclopentanone  1.395     0.874     8.378     3.514     0.627     3.392     6.445
Cyclohexanone           2.765     1.726     6.349     3.565     0.397     3.680     3.750
2-methylcyclohexanone   2.714     1.622     9.990     4.205     0.627     4.774     5.952
Cyclohex-2-ene-1-one    0.435     0.541     5.357     2.739     0.666     2.694     3.091
1,2-cyclohexanedione    0.787     0.416     0.077     0.237     0.096     0.083     ND
1,3-cyclohexanedione    0.237     0.978     0.237     0.397     0.032     ND        0.141
1,4-cyclohexanedione    3.405     1.123     8.346     3.994     0.794     3.302     6.150
Cycloheptanone          0.646     0.374     8.422     3.846     0.608     3.622     6.234
Cyclooctanone           ND        ND        1.984     0.646     0.410     0.627     0.141
Cyclodecanone           ND        ND        0.320     0.166     0.160     0.077     0.205
Cycloundecanone         ND        0.125     0.064     0.064     0.058     ND        0.051
Cyclododecanone         ND        0.229     0.122     0.198     0.051     ND        0.122
Cyclotridecanone        ND        ND        0.166     0.147     ND        ND        0.109
Cyclopentadecanone      ND        ND        0.109     0.122     ND        0.122     ND
2-tridecanone           ND        0.187     ND        ND        0.096     0.160     1.690
dihexyl ketone          ND        0.270     ND        ND        ND        0.160     ND
2-phenylcyclohexanone   1.459     0.104     5.370     ND        0.192     1.050     0.730
Oxindole                2.438     0.229     7.091     4.845     0.307     3.411     4.858
Levoglucosenone         ND        ND        1.126     0.525     0.147     0.461     0.506
dimethyl sulfoxide      0.230     ND        0.819     0.422     0.358     0.518     0.544
dimethyl-2-piperidone   2.822     0.354     8.384     4.154     0.557     3.539     6.509
Phenylboronic acid      1.606     ND        0.102     0.192     ND        ND        0.109
beta-ionone             0.109     0.374     3.347     1.485     0.544     2.707     0.544

Example 9 Cloning of Rhodococcus erythropolis AN12 Monooxygenase Genes into Escherichia coli

[0268] This example illustrates the construction of a suite of recombinant E. coli strains, each containing a full length BVMO gene from Rhodococcus erythropolis AN12.

[0269] Full length BV monooxygenases were PCR amplified, using chromosomal DNA as the template and the primers shown below in Table 6.

TABLE 6
Primers Used for Amplification of Full-Length BV Rhodococcus erythropolis AN12 Monooxygenases
chnB Monooxygenase; Forward Primer; Reverse Primer
ORF 8; atg agc aca gag ggc aag tac (SEQ ID NO:82); [tca] gtc ctt gtt cac gta (SEQ ID NO:83); gc gta ggc c
ORF 9; atg gtc gac atc gac cca acc (SEQ ID NO:84); tta tcg gct cct cac ggt ttc (SEQ ID NO:85); tc tcg
ORF 10; atg acc gat cct gac ttc tcc (SEQ ID NO:86); tca tgc gtg cac cgc act gtt (SEQ ID NO:87); acc cag
ORF 11; atg agc ccc tcc ccc ttg ccg (SEQ ID NO:88); tca tgc gcg atc cgc ctt ctc (SEQ ID NO:89); ag gag
ORF 12; gtg aac aac gaa tct gac cac (SEQ ID NO:90); tca tgc ggt gta ctc cgg ttc (SEQ ID NO:91); ttc cg
ORF 13; atg agc acc gaa cac ctc gat (SEQ ID NO:92); tca act ctt gct cgg tac cgg (SEQ ID NO:93); g cg
ORF 14; atg aca gac gaa ttc gac gta (SEQ ID NO:94); tca gct ctg gtt cac agg gac (SEQ ID NO:95); gtg at gg
ORF 15; atg gcg gag ata gtc aat ggt (SEQ ID NO:96); tca ccc tcg cgc ggt cgg agt (SEQ ID NO:97); cc c
ORF 16; gtg aag ctt ccc gaa cat gtc (SEQ ID NO:98); tca tgc ctg gac gct ttc gat (SEQ ID NO:99); gaa acc tt g
ORF 17; atg aca cag cat gtc gac gta (SEQ ID NO:100); cta tgc gct ggc gac ctt gct (SEQ ID NO:101); ctg a atc
ORF 18; atg tca tca cgg gtc aac gac (SEQ ID NO:102); tca tcc ttt gcc tgt cgt cag (SEQ ID NO:103); ggc c tgc
ORF 19; atg act aca caa aag gcc ctg (SEQ ID NO:104); tca ggc gtc gac ggt gtc ggc (SEQ ID NO:105); acc c
ORF 20; atg aca act acc gaa tcc aga (SEQ ID NO:106); tca gcg cag att gaa gcc ctt (SEQ ID NO:107); act c gta tc

[0270] Following amplification, the gene fragments were cloned into pTrcHis-TOPO TA vectors with either an N-terminal tail or C-terminal tail, as provided by the vector sequence. These vectors were transformed into E. coli, with transformants grown in Luria-Bertani broth supplemented with ampicillin (100 μg/ml).

Example 10 Assays of chnB Monooxygenase Activities of Rhodococcus erythropolis AN12

[0271] Each enzyme expressed in Example 9 was tested for chnB monooxygenase activity according to its ability to convert cyclohexanone to caprolactone.

[0272] Conversion of Cyclohexanone to Caprolactone.

[0273] Clones containing the full length monooxygenase genes were transferred from LB agar plate to 5 mL of M63 minimal media (GIBCO) containing 10 mM glycerol, 50 μg/mL ampicillin, 0.1 mM IPTG, and 500 mg/L cyclohexanone. In addition to the clones containing full length monooxygenases, a plasmid without an insert and a “no cell” control were also assayed. The encoded monooxygenase sequences were expressed upon addition of IPTG to the culture media. The cultures were incubated overnight at room temperature (24° C.). Samples (1.25 mL) for analysis were taken immediately after inoculation and after overnight incubation; cells were removed by centrifugation (4° C., 13,000 rpm).

[0274] GC-MS Detection of Caprolactone

[0275] Caprolactone formed by the action of the cloned monooxygenase was extracted from the aqueous phase with ethyl acetate (1.0 ml aqueous/0.5 mL ethyl acetate). Caprolactone was detected by gas chromatography mass spectrometry (GC-MS) analysis, using an Agilent 6890 Gas chromatograph system.

[0276] The analysis of the ethyl acetate phase was performed by injecting 1 μL of the ethyl acetate phase into the GC. The inlet temperature was 115° C. and the column temperature profile was 50° C. for 4 min, then ramped to 250° C. at 20° C./min, for a total run time of 14 min. The compounds were separated with a Hewlett Packard HP-5MS (5% phenyl methyl siloxane) column (30 m length, 250 μm diameter, and 0.25 μm film thickness). The mass spectrometer was run in Electron Ionization mode. The background mass spectrum was subtracted from the spectrum at the retention time of caprolactone (9.857 min). Presence of caprolactone was confirmed by comparison of the test reactions to an authentic standard obtained from Aldrich Chemical Company (St. Louis, Mo.).
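The stated 14 min run time follows directly from the oven program (a 4 min hold plus a 200° C. ramp at 20° C./min). A small sketch of that arithmetic is given below; the function names are hypothetical, and the oven temperature at the caprolactone retention time is a derived estimate, not a value reported in the source.

```python
def gc_run_time(hold_min, start_c, end_c, ramp_c_per_min, final_hold_min=0.0):
    """Total run time of a single-ramp GC oven program, in minutes."""
    return hold_min + (end_c - start_c) / ramp_c_per_min + final_hold_min


def oven_temperature_at(t_min, hold_min, start_c, end_c, ramp_c_per_min):
    """Oven temperature at time t for the same single-ramp program."""
    if t_min <= hold_min:
        return float(start_c)
    return min(float(end_c), start_c + (t_min - hold_min) * ramp_c_per_min)


total = gc_run_time(4, 50, 250, 20)
temp_at_elution = oven_temperature_at(9.857, 4, 50, 250, 20)
print(total, round(temp_at_elution, 1))
```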

[0277] Results of these assays are shown below in Table 7, in terms of the presence or absence of detectable caprolactone formation according to the activity of each expressed BV monooxygenase enzyme.
TABLE 7 Ability of Monooxygenase Enzymes to Convert Cyclohexanone to Caprolactone Formation of Caprolactone Detected Not Detected Not Assayed chnB ORF 8 ORF 15 ORF 10 Monooxygenases ORF 9 No cell control ORF 13 ORF 11 Plasmid control ORF 14 ORF 12 ORF 20 ORF 16 ORF 17 ORF 18 ORF 19

Example 11 Identification of Signature Sequences between Families of BV Monooxygenases

[0278] Sequence analysis of the 20 genes encoding Baeyer-Villiger monooxygenases identified in the previous examples allows definition of three different BV signature sequence families based on amino acid similarities. Each family possesses several member genes for which biochemical validation of the enzyme as a functional BV enzyme capable of the oxidation of cyclohexanone was demonstrated (Examples, supra). Sequence alignment of the homologues for each family was performed by Clustal W alignment (Higgins and Sharp (1989) CABIOS. 5:151-153). This allows the identification of a set of amino acids that are conserved at specific positions in the alignment created from all the sequences available.

[0279] The results of these Clustal W alignments are shown in FIGS. 7, 8, and 9 for BV Family 1, BV Family 2, and BV Family 3. In all cases, an “*” indicates a conserved signature amino acid position. The conserved amino acid signature sequence for each Family is shown in FIG. 6, along with the signature sequence P-# positions. This conserved amino acid/position set becomes a signature for each family. Any new protein with a sequence that can be aligned with those of the existing members of the family and which includes, at the specific positions, at least 80% of the signature sequence amino acids can be considered a member of the specific family.
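The 80% membership rule can be expressed as a short check. The sketch below assumes the candidate has already been aligned to the family so that alignment columns are comparable; the toy signature and sequences are hypothetical, while the match counts in the final lines (72 of 74, 69 of 76, 30 of 41) are the ones reported for public sequences in the family descriptions of this example.

```python
def signature_match_fraction(aligned_seq, signature):
    """Fraction of signature positions matched by an aligned sequence.

    `signature` maps an alignment column (0-based) to the conserved residue.
    """
    matches = sum(
        1
        for column, residue in signature.items()
        if column < len(aligned_seq) and aligned_seq[column] == residue
    )
    return matches / len(signature)


def is_family_member(aligned_seq, signature, threshold=0.80):
    """Apply the at-least-80% rule to an aligned candidate sequence."""
    return signature_match_fraction(aligned_seq, signature) >= threshold


# Toy five-position signature (hypothetical, for illustration only).
toy_signature = {0: "M", 3: "G", 5: "W", 8: "P", 9: "Y"}
print(is_family_member("MKTGAWLLPY", toy_signature))  # all five positions match
print(is_family_member("MKTGAFLLPA", toy_signature))  # three of five match

# Match counts reported in the text for publicly known sequences.
for matched, total in ((72, 74), (69, 76), (30, 41)):
    print(matched, total, matched / total >= 0.80)
```

With these counts, the first two sequences clear the 80% threshold and the third (30 of 41, about 73%) does not.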

[0280] BV Family 1

[0281] This family comprises the chnB monooxygenase sequences of Arthrobacter sp. BP2 (SEQ ID NO:12), Rhodococcus sp. phi1 (SEQ ID NO:8), Rhodococcus sp. phi2 (SEQ ID NO:10), Acidovorax sp. CHX (SEQ ID NO:18), Brevibacterium sp. HCU (SEQ ID NOs:14 and 16), and Rhodococcus erythropolis AN12 ORF10, ORF14, ORF19, and ORF20 (SEQ ID NOs:26, 34, 44 and 46). Within a length of 540 amino acids, a total of 74 positions are conserved (100%). This signature sequence of Family 1 BV monooxygenases is shown beneath each alignment of proteins (FIG. 7) and is listed as SEQ ID NO:47. The ability to identify the signature sequence within this family of proteins was made possible by: 1) the number of sequences of BV monooxygenases; and 2) the characterization of their activity as BV-monooxygenases.

[0282] Based on the limited number (4 total) of BV monooxygenase sequences in the public domain, for which biochemical data is also available, 3 of these sequences align with the signature sequence discovered for Family 1. These sequences are:

[0283] (1) Acinetobacter sp. NCIMB9871 chnB (NCBI Accession Number AB026668, based on Chen, Y.C. et al. (J Bacteriol. 170(2):781-789 (1988))). Key biochemical characterization of this protein was performed by Donoghue et al. (Eur J Biochem. 16:63(1):175-92 (1976)), Trudgill et al. (Methods Enzymol. 188:70-77 (1990)), and Iwaki et al. (Appl Environ Microbiol. 65(11):5158-62 (1999)). This enzyme shares 72 of the 74 conserved amino acids in the signature sequence of Family 1 BV monooxygenases.

[0284] (2) Rhodococcus erythropolis limB (NCBI Accession Number AJ272366, based on the work of Barbirato et al. (FEBS Lett. 438 (3): 293-296 (1998)) and van der Werf et al. (J. Biol. Chem. 274 (37): 26296-26304 (1999))). Key biochemical characterization of this protein was performed by van der Werf, M. J. et al. (Microbiology 146 (Pt 5):1129-41 (2000); Biochem J. 1;347 Pt 3:693-701 (2000); and Appl Environ Microbiol. 65(5):2092-102 (1999)). This enzyme is known as a carvone monooxygenase.

[0285] (3) Rhodococcus rhodochrous smo (NCBI Accession Number AB010439). This enzyme was sequenced and characterized by Morii, S. et al. (J. Biochem. 126 (3), 624-631 (1999)). This enzyme is known as a steroid monooxygenase. It shares 74 of the 74 conserved amino acids in the signature sequence of Family 1 BV monooxygenases.

[0286] The enzymes described in the public domain having the highest sequence similarity to Group 1 have been characterized as dimethylaniline hydroxylases.

[0287] BV Family 2

[0288] This family comprises the chnB monooxygenase sequences of Rhodococcus erythropolis AN12 ORF9, ORF12, ORF15, ORF16, and ORF18 (SEQ ID NOs:24, 30, 36, 38, and 42). Within a length of 497 amino acids, a total of 76 positions are conserved (100%). This signature sequence for Family 2 BV monooxygenases is shown beneath each alignment of proteins (FIG. 8) and is listed as SEQ ID NO:48. The ability to identify the signature sequence within this family of proteins was made possible by: 1) the number of sequences of BV monooxygenases; and 2) the characterization of their activity as BV-monooxygenases.

[0289] Based on the limited number (4 total) of BV monooxygenase sequences in the public domain, for which biochemical data is also available, only 1 of these sequences aligns with the signature sequence discovered for Family 2. This sequence is from Pseudomonas putida JD1. Key biochemical characterization of this protein was performed by Tanner A., et al. (J Bacteriol. 182(23):6565-6569 (2000)). This enzyme is known as an acetophenone monooxygenase. It shares 69 of the 76 conserved amino acids in the signature sequence of Family 2 BV monooxygenases.

[0290] BV Family 3

[0291] This family comprises the chnB monooxygenase sequences of Rhodococcus erythropolis AN12 ORF8, ORF11, ORF13, and ORF17 (SEQ ID NOs:22, 28, 32, and 40). Within a length of 471 amino acids, a total of 41 positions are conserved (100%). This signature sequence for Family 3 BV monooxygenases is shown beneath each alignment of proteins (FIG. 9) and is listed as SEQ ID NO:49. The ability to identify the signature sequence within this family of proteins was made possible by: 1) the number of sequences of BV monooxygenases; and 2) the characterization of their activity as BV-monooxygenases.

[0292] There are no sequences in the public domain with demonstrated BV activity that belong to this group. The dimethylaniline N-oxidase shares only 30 amino acids out of 41 conserved amino acids discovered in the signature sequence, which represents less than 80% of the conserved positions.

1 113 1 791 DNA Arthrobacter sp. BP2 1 accaccttcg acggctcccc cccacaagggttaggccacc ggcttcgggt gttaccaact 60 ttcgtgactt gacgggcggt gtgtacaaggcccgggaacg tattcaccgc agcgttgctg 120 atctgcgatt actagcgact ccgacttcatggggtcgagt tgcagacccc aatccgaact 180 gagaccggct ttttgggatt agctccacctcacagtatcg caaccctttg taccggccat 240 tgtagcatgc gtgaagccca agacataaggggcatgatga tttgacgtcg tccccacctt 300 cctccgagtt gaccccggca gtctcctatgagtccccggc cgaaccgctg gcaacataga 360 acgagggttg cgctcgttgc gggacttaacccaacatctc acgacacgag ctgacgacaa 420 ccatgcacca cctgtaaacc ggccgcaagcggggcacctg tttccaggtc tttccggtcc 480 atgtcaagcc ttggtaaggt tcttcgcgttgcatcgaatt aatccgcatg ctccgccgct 540 tgtgcgggcc cccgtcaatt cctttgagttttagccttgc ggccgtactc cccaggcggg 600 gcacttaatg cgttagctac ggcgcggaaaacgtggaatg tcccccacac ctagtgccca 660 acgtttacgg catggactac cagggtatctaatcctgttc gctccccatg ctttcgctcc 720 tcagcgtcag ttacagccca gagacctgcctttgccatcg gtgttcctct tgatatctgc 780 gcatttcacc g 791 2 1303 DNARhodococcus sp. phi1 2 gtgcttaaca catgcaagtc gaacgatgaa gcccagcttgctgggtggat tagtggcgaa 60 cgggtgagta acacgtgggt gatctgccct gcactctgggataagcctgg gaaactgggt 120 ctaataccgg atatgacctc gggatgcatg tcctggggtggaaagttttt cggtgcagga 180 tgagcccgcg gcctatcagc ttgttggtgg ggtaatggcctaccaaggcg acgacgggta 240 gccggcctga gagggcgacc ggccacactg ggactgagacacggcccaga ctcctacggg 300 aggcagcagt ggggaatatt gcacaatggg cgcaagcctgatgcagcgac gccgcgtgag 360 ggatgacggc cttcgggttg taaacctctt tcacccatgacgaagcgcaa gtgacggtag 420 tgggagaaga agcaccggcc aactacgtgc cagcagccgcggtaatacgt aggtgcgagc 480 gttgtccgga attactgggc gtaaagagct cgtaggcggtttgtcgcgtc gtctgtgaaa 540 tcccgcagct caactgcggg cttgcaggcg atacgggcagactcgagtac tgcaggggag 600 actggaattc ctggtgtagc ggtgaaatgc gcagatatcaggaggaacac cggtggcgaa 660 ggcgggtctc tgggcagtaa ctgacgctga ggagcgaaagcgtgggtagc gaacaggatt 720 agataccctg gtagtccacg ccgtaaacgg tgggcgctaggtgtgggttt ccttccacgg 780 gatccgtgcc gtagccaacg cattaagcgc cccgcctggggagtacggcc gcaaggctaa 840 aactcaaagg aattgacggg ggcccgcaca 
agcggcggagcatgtggatt aattcgatgc 900 aacgcgaaga accttacctg ggtttgacat gtaccggacgactgcagaga tgtggtttcc 960 cttgtggccg gtagacaggt ggtgcatggc tgtcgtcagctcgtgtcgtg agatgttggg 1020 ttaagtcccg caacgagcgc aacccttgtc ctgtgttgccagcacgtgat ggtggggact 1080 cgcaggagac tgccggggtc aactcggagg aaggtggggacgacgtcaag tcatcatgcc 1140 ccttatgtcc agggcttcac acatgctaca atggtcggtacagagggctg cgataccgtg 1200 aggtggagcg aatcccttaa agccggtctc agttcggatcggggtctgca actcgacccc 1260 gtgaagtcgg agtcgctagt aatcgcagat cagcaacgctgcg 1303 3 1296 DNA Rhodococcus sp. phi2 3 gcttaacaca tgcaagtcgaacgatgaagc ccagcttgct gggtggatta gtggcgaacg 60 ggtgagtaac acgtgggtgatctgccctgc acttcgggat aagcctggga aactgggtct 120 aataccggat aggacctcgggatgcatgtt ccggggtgga aaggttttcc ggtgcaggat 180 gggcccgcgg cctatcagcttgttggtggg gtaacggccc accaaggcga cgacgggtag 240 ccggcctgag agggcgaccggccacactgg gactgagaca cggcccagac tcctacggga 300 ggcagcagtg gggaatattgcacaatgggc gcaagcctga tgcagcgacg ccgcgtgagg 360 gatgacggcc ttcgggttgtaaacctcttt cagtaccgac gaagcgcaag tgacggtagg 420 tacagaagaa gcaccggccaactacgtgcc agcaagccgc ggtaatacgt aaggtgcgaa 480 gcgttgtccg gaattactgggcgtaaagag ctcgtaggcg gtttgtcgcg tcgtctgtga 540 aaacccgcag ctcaactgcgggcttgcagg cgatacgggc agacttgagt actgcagggg 600 agactggaat tcctggtgtagcggtgaaat gcgcagatat caggaggaac accggtggcg 660 aaggcgggtc tctgggcagtaactgacgct gaggagcgaa agcgtgggta gcgaacagga 720 ttagataccc tggtagtccacgccgtaaac ggtgggcgct aggtgtgggt ttccttccac 780 gggatccgtg ccgtagctaacgcattaagc gccccgcctg gggagtacgg ccgcaaggct 840 aaaactcaaa ggaattgacgggggcccgca caagcggcgg agcatgtgga ttaattcgat 900 gcaacgcgaa gaaccttacctgggtttgac atacaccgga ccgccccaga gatggggttt 960 cccttgtggt cggtgtacaggtggtgcatg gctgtcgtca gctcgtgtcg tgagatgttg 1020 ggttaagtcc cgcaacgagcgcaacccttg tcctgtgttg ccagcacgta atggtgggga 1080 ctcgcaggag actgccggggtcaactcgga ggaaggtggg gacgacgtca agtcatcatg 1140 ccccttatgt ccagggcttcacacatgcta caatggccgg tacagagggc tgcgataccg 1200 cgaggtggag cgaatcccttaaagccggtc tcagttcgga tcggggtctg caactcgacc 1260 
ccgtgaagtc ggagtcgctagtaatcgcag atcagc 1296 4 1388 DNA Brevibacterium sp. HCU 4 cgcccttgagtttgatcctg gctcaggacg aacgctggct gcgtgcttaa cacatgcaag 60 tcgaacgctgaagccgacag cttgctgttg gtggatgagt ggcgaacggg tgagtaacac 120 gtgagtaacctgcccctgat ttcgggataa gcctgggaaa ctgggtctaa taccggatac 180 gaccacctgacgcatgttgg gtggtggaaa gtttttcgat cggggatggg ctcgcggcct 240 atcagcttgttggtggggta atggcctacc aaggcgacga cgggtagccg gcctgagagg 300 gcgaccggccacactgggac tgagacacgg cccagactcc tacgggaggc agcagtgggg 360 aatattgcacaatgggggaa accctgatgc agcgacgcag cgtgcgggat gacggccttc 420 gggttgtaaaccgctttcag cagggaagaa gcgaaagtga cggtacctgc agaagaagta 480 ccggctaactacgtgccagc agccgcggta atacgtaggg tacgagcgtt gtccggaatt 540 attgggcgtaaagagctcgt aggtggttgg tcacgtctgc tgtggaaacg caacgcttaa 600 cgttgcgcgtgcagtgggta cgggctgact agagtgcagt aggggagtct ggaattcctg 660 gtgtagcggtgaaatgcgca gatatcagga ggaacaccgg tggcgaaggc gggactctgg 720 gctgtaactgacactgagga gcgaaagcat ggggagcgaa caggattaga taccctggta 780 gtccatgccgtaaacgttgg gcactaggtg tgggggacat tccacgttct ccgcgccgta 840 gctaacgcattaagtgcccc gcctggggag tacggtcgca aggctaaaac tcaaaggaat 900 tgacgggggcccgcacaagc ggcggagcat gcggattaat tcgatgcaac gcgaagaacc 960 ttaccaaggcttgacataca ctggaccgtt ctggaaacag ttcttctctt tggagctggt 1020 gtacaggtggtgcatggttg tcgtcagctc gtgtcgtgag atgttgggtt aagtcccgca 1080 acgagcgcaaccctcgttct atgttgccag cacgtgatgg tgggaactca taggagactg 1140 ccggggtcaactcggaggaa ggtggggatg acgtcaaatc atcatgccct ttatgtcttg 1200 ggcttcacgcatgctacaat ggctggtaca gagagaggcg aacccgtgag ggtgagcgaa 1260 tcccttaaagccagtctcag ttcggatcgt agtctgcaat tcgactacgt gaagtcggag 1320 tcgctagtaatcgcagatca gcaacgctgc ggtgaatacg ttcccgggcc ttgtacacac 1380 cgcccgta1388 5 895 DNA Brachymonas sp. 
CHX 5 taggctaact acttctggca gaacccgctcccatggtgtg acgggcggtg tgtacaagac 60 ccgggaacgt attcaccgcg acatgctgatccgcgattac tagcgattcc gacttcacgc 120 agtcgagttg cagactgcga tccggactacgaccggcttt gtgggattgg ctccccctcg 180 cgggttggct accctctgta ccggccattgtatgacgtgt gtagccccac ctataagggc 240 catgaggact tgacgtcatc cccaccttcctccggtttgt caccggcagt cccattagag 300 tgccctttcg tagcaactaa tggcaagggttgcgctcgtt gcgggactta acccaacatc 360 tcacgacacg agctgacgac agccatgcagcacctgtgtg caggttctct ttcgagcact 420 cccaaatctc ttcaggattc ctgccatgtcaaaggtgggt aaggtttttc gcgttgcatc 480 gaattaaacc acatcatcca ccgcttgtgcgggtccccgt caattccttt gagtttcaac 540 cttgcggccg tactccccag gcggtcaacttcacgcgttg gcttcgttac tgagtcagct 600 aagacccaac aaccagttga catcgtttagggcgtggact accagggtat ctaatcctgt 660 ttgctcccca cgctttcgtg catgagcgtcagtgcaggcc caggggattg ccttcgccat 720 cggtgttcct ccgcatatct acgcatttcactgctacacg cggaattcca tccccctctg 780 ccgcactcca gctttgcagt cacaaaggcagttcccaggt tgagcccggg gatttcacct 840 ctgtcttaca aaaccgcctg cgcacgctttacgcccagta attccgatga acgct 895 6 1439 DNA Rhodococcus erythropolis AN12misc_feature (1417)..(1417) N = G or A or T or C 6 aaaacgctgg gcgggcgttgcttaacacat gcaattcgag cggtaaggcc tttcggggta 60 cacaagcggc gaacgggtgagtaacacgtg ggtgatctgc cctgcacttc gggataagcc 120 tgggaaactg ggtctaataccggatatgac ctcaggtcgc atgacttggg gtggaaaaat 180 ttatcggtgc aggatgggcccgcggcctat cagcttgttg gtggggtaat ggcctaccaa 240 ggcgacaacg ggtacccgacctgaaagggt gaccggccac actgggactg aaacacggcc 300 caaactccta cgggaggcagcagtggggaa tattgcacaa tgggcgaaag cctgatgcac 360 cgaccccgcg tgagggatgacggccttcgg gttgtaaacc tctttcagca gggacaaacg 420 caagtgacgg tacctgcagaagaagccccg gctaactacg tgccagcagc cgcggtatta 480 cttagggtgc aagcgttgtccggaattact gggcgtaaag agttcgtacg cggtttgtcg 540 cgtcgtttgt gaaaaccagcagctcaactg ctggcttgca ggcgatacgg gcagacttga 600 gtactgcagg ggagactggaattcctggtg tagcggtgaa atgcgcagat atcaggagga 660 acaccggtgg cgaaggcgggtctctgggca ctaactgacg ctgaggaacg aaagcgtggg 720 tagcgaacag gattacataccctggtagtc cacgccgtaa 
acggtgggcg ctaggtgtgg 780 gttccttcca cggaatccgtgccgtagcta acgcattaag cgccccgcct ggggagtacg 840 gccgcaaggc taaaactcaaaggaattgac gggggcccgc acaatcggcg gaacatgtgg 900 attaattcga tgcaacgcgaagaaccttac tgggtttgac atataccgga aagctgcaga 960 gatgtggccc cctttgtggtcggtatacag gtggtgcatg gctgtcgtca gctcgtgtcg 1020 tgagatgttg ggttaagtcccgcaacgagc gcaaccccta tcttatgttg ccagcacgtt 1080 atggtgggga ctcgtaagagactgccgggg tcaactcgga ggaaggtggg gacgacgtca 1140 agtcatcatg ccccttatgtccagggcttc acacatgcta caatggccag tacagagggc 1200 tgcgagaccg tgaggtggagcgaatccctt aaagctggtc tcagttcgga tcggggtctg 1260 caactcgacc ccgtgaagtcggagtcgcta gtaatcgcag atcagcaacg ctgcggtgaa 1320 tacgttcccg ggccttgtacacaccgcccg tcacgtcatg aaagtcggta acacccgaag 1380 ccggtggctt aaccccttgtgcgaggagcc gtcgaangtg ggatcggcga ttgggcgcc 1439 7 1626 DNA Rhodococcussp. phi1 7 atgactgcac agatctcacc cacagttgtc gacgccgttg tcatcggcgccggattcggc 60 ggcatctacg ccgtgcacaa gctgcacaac gaacagggcc tgaccgtggtcggtttcgac 120 aaggcggacg gccccggcgg tacctggtac tggaaccgct acccgggagcgctctccgac 180 accgagagtc atctctaccg cttctcgttc gaccgcgacc tgctgcaggacggcacgtgg 240 aagaccacgt acatcaccca gcccgagatc ctcgagtatc tcgagagcgtcgtcgaccgg 300 ttcgacctgc gtcgtcactt ccggttcggc accgaggtca cctcggcgatctacctcgag 360 gacgagaacc tgtgggaggt ctccaccgac aagggtgagg tctaccgggccaagtacgtc 420 gtcaacgccg tgggcctgct ctccgccatc aacttccccg acctccccggcctcgacacc 480 ttcgagggcg agaccatcca caccgccgcc tggcccgagg gcaagaacctcgccggcaag 540 cgtgtcggtg tcatcggtac cggatcgacc gggcagcagg tcatcaccgccctcgcgccg 600 gaggtcgagc acctcaccgt cttcgtccgc accccgcagt actccgtgccggtcggcaac 660 cgtcccgtga cgaaggaaca gatcgacgcg atcaaggccg actacgacggtatctgggac 720 agcgtcaaga agtccgcggt ggccttcggg ttcgaggagt ccaccctgcctgccatgtcc 780 gtctcggaag aggagcgcaa ccgcatcttc caggaggcgt gggaccacggcggcggcttc 840 cgcttcatgt tcggcacctt cggcgacatc gccaccgacg aggccgccaacgaagctgcg 900 gcatcgttca tccgctccaa gatcgccgag atcatcgagg atccggaaacggcccgcaag 960 ctgatgccga ccggtctgta cgccaagcgt ccgctgtgcg acaacggctactacgaggtg 1020 
tacaaccgcc cgaacgtcga ggccgtcgcg atcaaggaga accccatccgtgaggtcacc 1080 gccaagggcg tcgtgaccga ggacggtgtc ctccacgaac tcgacgtgctcgtcttcgcc 1140 accggcttcg acgccgtcga cggcaactac cgccggatcg agatccgcggccggaacggc 1200 ctgcacatca acgaccactg ggacggccag ccgacgagct acctcggcgtcaccaccgcg 1260 aacttcccca actggttcat ggtgctcggt cccaacggcc cgttcacaaacctgccgccg 1320 agcatcgaaa cgcaggtcga gtggatcagc gacaccgtcg cctacgccgagcgcaacgag 1380 atccgtgcga tcgaacccac cccggaggcc gaggaggagt ggacgcagacctgcaccgac 1440 atcgcgaacg ccacgctgtt cacccgcggt gactcctgga tcttcggcgcgaatgttccg 1500 ggcaagaagc cgagcgtcct gttctacctg ggcggactgg gcaactaccgcaacgtcctc 1560 gcgggtgtcg tcgccgacag ctaccgaggt ttcgagttga agtccgctgtcccggtgacc 1620 gcctga 1626 8 542 PRT Rhodococcus sp. phi1 8 Met Thr AlaGln Ile Ser Pro Thr Val Val Asp Ala Val Val Ile Gly 1 5 10 15 Ala GlyPhe Gly Gly Ile Tyr Ala Val His Lys Leu His Asn Glu Gln 20 25 30 Gly LeuThr Val Val Gly Phe Asp Lys Ala Asp Gly Pro Gly Gly Thr 35 40 45 Trp TyrTrp Asn Arg Tyr Pro Gly Ala Leu Ser Asp Thr Glu Ser His 50 55 60 Leu TyrArg Phe Ser Phe Asp Arg Asp Leu Leu Gln Asp Gly Thr Trp 65 70 75 80 LysThr Thr Tyr Ile Thr Gln Pro Glu Ile Leu Glu Tyr Leu Glu Ser 85 90 95 ValVal Asp Arg Phe Asp Leu Arg Arg His Phe Arg Phe Gly Thr Glu 100 105 110Val Thr Ser Ala Ile Tyr Leu Glu Asp Glu Asn Leu Trp Glu Val Ser 115 120125 Thr Asp Lys Gly Glu Val Tyr Arg Ala Lys Tyr Val Val Asn Ala Val 130135 140 Gly Leu Leu Ser Ala Ile Asn Phe Pro Asp Leu Pro Gly Leu Asp Thr145 150 155 160 Phe Glu Gly Glu Thr Ile His Thr Ala Ala Trp Pro Glu GlyLys Asn 165 170 175 Leu Ala Gly Lys Arg Val Gly Val Ile Gly Thr Gly SerThr Gly Gln 180 185 190 Gln Val Ile Thr Ala Leu Ala Pro Glu Val Glu HisLeu Thr Val Phe 195 200 205 Val Arg Thr Pro Gln Tyr Ser Val Pro Val GlyAsn Arg Pro Val Thr 210 215 220 Lys Glu Gln Ile Asp Ala Ile Lys Ala AspTyr Asp Gly Ile Trp Asp 225 230 235 240 Ser Val Lys Lys Ser Ala Val AlaPhe Gly Phe Glu Glu Ser Thr Leu 245 250 255 Pro Ala Met Ser Val Ser GluGlu Glu Arg Asn Arg Ile Phe Gln 
Glu 260 265 270 Ala Trp Asp His Gly GlyGly Phe Arg Phe Met Phe Gly Thr Phe Gly 275 280 285 Asp Ile Ala Thr AspGlu Ala Ala Asn Glu Ala Ala Ala Ser Phe Ile 290 295 300 Arg Ser Lys IleAla Glu Ile Ile Glu Asp Pro Glu Thr Ala Arg Lys 305 310 315 320 Leu MetPro Thr Gly Leu Tyr Ala Lys Arg Pro Leu Cys Asp Asn Gly 325 330 335 TyrTyr Glu Val Tyr Asn Arg Pro Asn Val Glu Ala Val Ala Ile Lys 340 345 350Glu Asn Pro Ile Arg Glu Val Thr Ala Lys Gly Val Val Thr Glu Asp 355 360365 Gly Val Leu His Glu Leu Asp Val Leu Val Phe Ala Thr Gly Phe Asp 370375 380 Ala Val Asp Gly Asn Tyr Arg Arg Ile Glu Ile Arg Gly Arg Asn Gly385 390 395 400 Leu His Ile Asn Asp His Trp Asp Gly Gln Pro Thr Ser TyrLeu Gly 405 410 415 Val Thr Thr Ala Asn Phe Pro Asn Trp Phe Met Val LeuGly Pro Asn 420 425 430 Gly Pro Phe Thr Asn Leu Pro Pro Ser Ile Glu ThrGln Val Glu Trp 435 440 445 Ile Ser Asp Thr Val Ala Tyr Ala Glu Arg AsnGlu Ile Arg Ala Ile 450 455 460 Glu Pro Thr Pro Glu Ala Glu Glu Glu TrpThr Gln Thr Cys Thr Asp 465 470 475 480 Ile Ala Asn Ala Thr Leu Phe ThrArg Gly Asp Ser Trp Ile Phe Gly 485 490 495 Ala Asn Val Pro Gly Lys LysPro Ser Val Leu Phe Tyr Leu Gly Gly 500 505 510 Leu Gly Asn Tyr Arg AsnVal Leu Ala Gly Val Val Ala Asp Ser Tyr 515 520 525 Arg Gly Phe Glu LeuLys Ser Ala Val Pro Val Thr Ala Glx 530 535 540 9 1623 DNA Rhodococcussp. 
phi2 9 atgaccgcac agaccatcca caccgtcgac gccgtcgtca tcggcgccggattcggcggc 60 atctacgccg tccacaagct gcaccacgaa ctcggcctga ccaccgtcggattcgacaag 120 gcagacggcc ccggcggcac ctggtactgg aaccgctacc cgggcgccctctccgacacg 180 gagagccacc tctaccgctt ctccttcgac cgcgacctgc tgcaggacggcacctggaag 240 aacacgtacg tcacccagcc cgagatcctg gagtatctcg aggacgtcgtcgaccgcttc 300 gacctgcgcc gccacttccg gttcggcacc gaggtcacct cggcgatctatctcgacgac 360 gagaacctct gggaggtcac caccgacggc ggcgacgtct atcgggcgacctacgtcgtc 420 aacgccgtcg ggctgctctc cgccatcaac ttcccgaacc tgcccggcctggacacgttc 480 gagggcgaga ccatccacac cgccgcctgg ccggagggca agagcctcgccgggcgccgc 540 gtcggcgtca tcggtaccgg ttccaccggc cagcaggtca tcacggcgctggcgccggag 600 gtcgagcacc tcaccgtctt cgtccggacc ccgcagtact ccgtaccggtcggcaaccgt 660 cccgtgaccc cggagcagat cgacgcgatc aaggccgact acgaccgaatctgggagcag 720 gccaagaact ccgcggtggc cttcggcttc gaggagtcca ccctgccggccatgtccgtc 780 tcggaggagg agcgcaaccg gatcttccag gaggcctggg accacggcggcggattccgt 840 ttcatgttcg gcaccttcgg tgacatcgcc accgacgagg ccgccaacgaagccgccgcg 900 tcgttcatcc gctccaagat cgccgagatc atcgaggatc cggagaccgcccgcaagctg 960 atgccgaccg gtctgttcgc caagcgcccg ctgtgcgacg ccggctaccaccaggtcttc 1020 aaccggccga acgtggaagc ggttgccatc aaggagaacc ccatccgcgaggtcaccgcg 1080 aagggcgtgg tgaccgagga cggcgtcctg cacgagttgg acgtgctcgtcttcgccacc 1140 ggcttcgacg ccgtggacgg caactaccgg cgcatcgaga tccgcggccgggacggcctg 1200 cacatcaacg accactggga cggccagccg accagctacc tgggcgtctccacggcgaac 1260 ttccccaact ggttcatggt gctgggcccc aacggtccgt tcacgaacctgcccccgagc 1320 atcgagaccc aggtcgagtg gatcagcgac acgatcgggt acgccgagcgcaacggtgtg 1380 cgggccatcg agcccacgcc ggaggccgag gccgaatgga ccgagacctgcaccgcgatc 1440 gcgaacgcca cgctgttcac caagggcgat tcgtggatct tcggcgcgaacatcccgggc 1500 aagacgccga gcgtactgtt ctacctgggc ggcctgcgca actaccgtgccgtcctcgcc 1560 gaggtcgcga ccgacggata ccggggcttc gacgtgaagt ccgccgagatggtcacggtc 1620 tga 1623 10 541 PRT Rhodococcus sp. 
phi2 10 Met Thr AlaGln Thr Ile His Thr Val Asp Ala Val Val Ile Gly Ala 1 5 10 15 Gly PheGly Gly Ile Tyr Ala Val His Lys Leu His His Glu Leu Gly 20 25 30 Leu ThrThr Val Gly Phe Asp Lys Ala Asp Gly Pro Gly Gly Thr Trp 35 40 45 Tyr TrpAsn Arg Tyr Pro Gly Ala Leu Ser Asp Thr Glu Ser His Leu 50 55 60 Tyr ArgPhe Ser Phe Asp Arg Asp Leu Leu Gln Asp Gly Thr Trp Lys 65 70 75 80 AsnThr Tyr Val Thr Gln Pro Glu Ile Leu Glu Tyr Leu Glu Asp Val 85 90 95 ValAsp Arg Phe Asp Leu Arg Arg His Phe Arg Phe Gly Thr Glu Val 100 105 110Thr Ser Ala Ile Tyr Leu Asp Asp Glu Asn Leu Trp Glu Val Thr Thr 115 120125 Asp Gly Gly Asp Val Tyr Arg Ala Thr Tyr Val Val Asn Ala Val Gly 130135 140 Leu Leu Ser Ala Ile Asn Phe Pro Asn Leu Pro Gly Leu Asp Thr Phe145 150 155 160 Glu Gly Glu Thr Ile His Thr Ala Ala Trp Pro Glu Gly LysSer Leu 165 170 175 Ala Gly Arg Arg Val Gly Val Ile Gly Thr Gly Ser ThrGly Gln Gln 180 185 190 Val Ile Thr Ala Leu Ala Pro Glu Val Glu His LeuThr Val Phe Val 195 200 205 Arg Thr Pro Gln Tyr Ser Val Pro Val Gly AsnArg Pro Val Thr Pro 210 215 220 Glu Gln Ile Asp Ala Ile Lys Ala Asp TyrAsp Arg Ile Trp Glu Gln 225 230 235 240 Ala Lys Asn Ser Ala Val Ala PheGly Phe Glu Glu Ser Thr Leu Pro 245 250 255 Ala Met Ser Val Ser Glu GluGlu Arg Asn Arg Ile Phe Gln Glu Ala 260 265 270 Trp Asp His Gly Gly GlyPhe Arg Phe Met Phe Gly Thr Phe Gly Asp 275 280 285 Ile Ala Thr Asp GluAla Ala Asn Glu Ala Ala Ala Ser Phe Ile Arg 290 295 300 Ser Lys Ile AlaGlu Ile Ile Glu Asp Pro Glu Thr Ala Arg Lys Leu 305 310 315 320 Met ProThr Gly Leu Phe Ala Lys Arg Pro Leu Cys Asp Ala Gly Tyr 325 330 335 HisGln Val Phe Asn Arg Pro Asn Val Glu Ala Val Ala Ile Lys Glu 340 345 350Asn Pro Ile Arg Glu Val Thr Ala Lys Gly Val Val Thr Glu Asp Gly 355 360365 Val Leu His Glu Leu Asp Val Leu Val Phe Ala Thr Gly Phe Asp Ala 370375 380 Val Asp Gly Asn Tyr Arg Arg Ile Glu Ile Arg Gly Arg Asp Gly Leu385 390 395 400 His Ile Asn Asp His Trp Asp Gly Gln Pro Thr Ser Tyr LeuGly Val 405 410 415 Ser Thr Ala Asn Phe Pro Asn Trp Phe Met 
Val Leu GlyPro Asn Gly 420 425 430 Pro Phe Thr Asn Leu Pro Pro Ser Ile Glu Thr GlnVal Glu Trp Ile 435 440 445 Ser Asp Thr Ile Gly Tyr Ala Glu Arg Asn GlyVal Arg Ala Ile Glu 450 455 460 Pro Thr Pro Glu Ala Glu Ala Glu Trp ThrGlu Thr Cys Thr Ala Ile 465 470 475 480 Ala Asn Ala Thr Leu Phe Thr LysGly Asp Ser Trp Ile Phe Gly Ala 485 490 495 Asn Ile Pro Gly Lys Thr ProSer Val Leu Phe Tyr Leu Gly Gly Leu 500 505 510 Arg Asn Tyr Arg Ala ValLeu Ala Glu Val Ala Thr Asp Gly Tyr Arg 515 520 525 Gly Phe Asp Val LysSer Ala Glu Met Val Thr Val Glx 530 535 540 11 1596 DNA Arthrobacter sp.BP2 11 atgactgcac agaacacttt ccagaccgtt gacgccgtcg tcatcggcgc cggcttcggc60 ggcatctacg ccgtccacaa gcttcacaac gagcagggtc tgaccgttgt cggcttcgac 120aaggccgacg gtcccggcgg cacctggtac tggaaccgct acccgggcgc tctctctgac 180accgagagcc acgtctaccg cttctctttc gataagggcc tcctgcagga cggcacctgg 240aagcacacct acatcaccca gcccgagatc ctcgagtacc ttgaggacgt cgttgaccgc 300tttgacctgc ggcgccactt ccgctttggt accgaggtca agtccgccac ctacctcgaa 360gacgagggcc tgtgggaagt gaccaccggc ggcggcgcgg tgtaccgggc taagtacgtc 420atcaacgccg tggggctgct gtcagccatc aacttcccga acctgcccgg gatcgacacc 480tttgagggcg agaccatcca caccgccgcc tggccgcagg gcaagtccct cgccggtcgc 540cgcgtgggtg tgatcggcac cggttccacc ggccagcagg tcatcacggc gctggcaccg 600gaagttgaac acctgaccgt cttcgtcagg accccgcagt actccgtccc ggtgggcaag 660cgccccgtga ccacccagca gattgacgag atcaaggccg actacgacaa catctgggca 720caggtcaagc gttccggcgt agccttcggc ttcgaggaaa gcaccgtgcc ggccatgagc 780gtcaccgaag aagaacgccg ccaggtctac gagaaggcct gggaatacgg cggcggcttc 840cgcttcatgt tcgaaacctt cagcgacatc gccaccgacg aggaggccaa cgagactgcg 900gcatccttca tccggaacaa gatcgtcgag accatcaagg atccggagac ggcacggaaa 960ctgacgccga cgggcttgtt cgcccgtcgc ccgctctgcg acgacggctt acttccaggt 1020gttcaaccgg cccaacgtcg aggctgtcgc tatcaaggaa aaccccattc gggaagtcac 1080ggccaagggt gtggtgacgg aggacggcgt gctgcacgag ctggacgtca tcgtcttcgc 1140gaccggtttc gacgccgtgg acggcaatta ccgccgcatg gagatcagcg ggcgcgacgg 1200cgtgaacatc aacgaccact 
gggacgggca gcccaccagc tacctgggcg tttccacagc 1260gaagttcccc aactggttca tggtgctggg acccaacggc ccgttcacga acctgccgcc 1320gagcatcgag acgcaggtcg aatggatcag cgacacggtg gcctacgcgg aggaaaacgg 1380aatccgggcg atcgagccga ccccggaggc cgaagccgag tggaccgaga cgtgcacaca 1440gatcgcgaac atgacggtgt tcaccaaggt cgattcatgg atcttcggcg cgaacgttcc 1500gggcaagaag cccagcgtgc tgttctatct gggcggcctg ggcaactacc gcggcgtcct 1560ggacgatgtc accgacaacg gataccgcgg ctttga 1596 12 532 PRT Arthrobacter sp.BP2 12 Met Thr Ala Gln Asn Thr Phe Gln Thr Val Asp Ala Val Val Ile Gly 15 10 15 Ala Gly Phe Gly Gly Ile Tyr Ala Val His Lys Leu His Asn Glu Gln20 25 30 Gly Leu Thr Val Val Gly Phe Asp Lys Ala Asp Gly Pro Gly Gly Thr35 40 45 Trp Tyr Trp Asn Arg Tyr Pro Gly Ala Leu Ser Asp Thr Glu Ser His50 55 60 Val Tyr Arg Phe Ser Phe Asp Lys Gly Leu Leu Gln Asp Gly Thr Trp65 70 75 80 Lys His Thr Tyr Ile Thr Gln Pro Glu Ile Leu Glu Tyr Leu GluAsp 85 90 95 Val Val Asp Arg Phe Asp Leu Arg Arg His Phe Arg Phe Gly ThrGlu 100 105 110 Val Lys Ser Ala Thr Tyr Leu Glu Asp Glu Gly Leu Trp GluVal Thr 115 120 125 Thr Gly Gly Gly Ala Val Tyr Arg Ala Lys Tyr Val IleAsn Ala Val 130 135 140 Gly Leu Leu Ser Ala Ile Asn Phe Pro Asn Leu ProGly Ile Asp Thr 145 150 155 160 Phe Glu Gly Glu Thr Ile His Thr Ala AlaTrp Pro Gln Gly Lys Ser 165 170 175 Leu Ala Gly Arg Arg Val Gly Val IleGly Thr Gly Ser Thr Gly Gln 180 185 190 Gln Val Ile Thr Ala Leu Ala ProGlu Val Glu His Leu Thr Val Phe 195 200 205 Val Arg Thr Pro Gln Tyr SerVal Pro Val Gly Lys Arg Pro Val Thr 210 215 220 Thr Gln Gln Ile Asp GluIle Lys Ala Asp Tyr Asp Asn Ile Trp Ala 225 230 235 240 Gln Val Lys ArgSer Gly Val Ala Phe Gly Phe Glu Glu Ser Thr Val 245 250 255 Pro Ala MetSer Val Thr Glu Glu Glu Arg Arg Gln Val Tyr Glu Lys 260 265 270 Ala TrpGlu Tyr Gly Gly Gly Phe Arg Phe Met Phe Glu Thr Phe Ser 275 280 285 AspIle Ala Thr Asp Glu Glu Ala Asn Glu Thr Ala Ala Ser Phe Ile 290 295 300Arg Asn Lys Ile Val Glu Thr Ile Lys Asp Pro Glu Thr Ala Arg Lys 305 310315 320 Leu Thr Pro Thr Gly Leu 
Phe Ala Arg Arg Pro Leu Cys Asp Asp Gly325 330 335 Leu Leu Pro Gly Val Gln Pro Ala Gln Arg Arg Gly Cys Arg TyrGln 340 345 350 Gly Lys Pro His Ser Gly Ser His Gly Gln Gly Cys Gly AspGly Gly 355 360 365 Arg Arg Ala Ala Arg Ala Gly Arg His Arg Leu Arg AspArg Phe Arg 370 375 380 Arg Arg Gly Arg Gln Leu Pro Pro His Gly Asp GlnArg Ala Arg Arg 385 390 395 400 Arg Glu His Gln Arg Pro Leu Gly Arg AlaAla His Gln Leu Pro Gly 405 410 415 Arg Phe His Ser Glu Val Pro Gln LeuVal His Gly Ala Gly Thr Gln 420 425 430 Arg Pro Val His Glu Pro Ala AlaGlu His Arg Asp Ala Gly Arg Met 435 440 445 Asp Gln Arg His Gly Gly LeuArg Gly Gly Lys Arg Asn Pro Gly Asp 450 455 460 Arg Ala Asp Pro Gly GlyArg Ser Arg Val Asp Arg Asp Val His Thr 465 470 475 480 Asp Arg Glu HisAsp Gly Val His Gln Gly Arg Phe Met Asp Leu Arg 485 490 495 Arg Glu ArgSer Gly Gln Glu Ala Gln Arg Ala Val Leu Ser Gly Arg 500 505 510 Pro GlyGln Leu Pro Arg Arg Pro Gly Arg Cys His Arg Gln Arg Ile 515 520 525 ProArg Leu Glx 530 13 1662 DNA Brevibacterium sp. 
HCU CDS (1)..(1662) 13atg cca att aca caa caa ctt gac cac gac gct atc gtc atc ggc gcc 48 MetPro Ile Thr Gln Gln Leu Asp His Asp Ala Ile Val Ile Gly Ala 1 5 10 15ggc ttc tcc gga cta gcc att ctg cac cac ctg cgt gaa atc ggc cta 96 GlyPhe Ser Gly Leu Ala Ile Leu His His Leu Arg Glu Ile Gly Leu 20 25 30 gacact caa atc gtc gaa gca acc gac ggc att gga gga act tgg tgg 144 Asp ThrGln Ile Val Glu Ala Thr Asp Gly Ile Gly Gly Thr Trp Trp 35 40 45 atc aaccgc tac ccg ggg gtg cgg acc gac agc gag ttc cac tac tac 192 Ile Asn ArgTyr Pro Gly Val Arg Thr Asp Ser Glu Phe His Tyr Tyr 50 55 60 tct ttc agcttc agc aag gaa gtt cgt gac gag tgg aca tgg act caa 240 Ser Phe Ser PheSer Lys Glu Val Arg Asp Glu Trp Thr Trp Thr Gln 65 70 75 80 cgc tac ccagac ggt gaa gaa gtt tgc gcc tat ctc aat ttc att gct 288 Arg Tyr Pro AspGly Glu Glu Val Cys Ala Tyr Leu Asn Phe Ile Ala 85 90 95 gat cga ctt gatctt cgg aag gac att cag ctc aac tca cga gtg aat 336 Asp Arg Leu Asp LeuArg Lys Asp Ile Gln Leu Asn Ser Arg Val Asn 100 105 110 act gcc cgt tggaat gag acg gaa aag tac tgg gac gtc att ttc gaa 384 Thr Ala Arg Trp AsnGlu Thr Glu Lys Tyr Trp Asp Val Ile Phe Glu 115 120 125 gac ggg tcc tcgaaa cgc gct cgc ttc ctc atc agc gca atg ggt gca 432 Asp Gly Ser Ser LysArg Ala Arg Phe Leu Ile Ser Ala Met Gly Ala 130 135 140 ctt agc cag gcgatt ttc ccg gcc atc gac gga atc gac gaa ttc aac 480 Leu Ser Gln Ala IlePhe Pro Ala Ile Asp Gly Ile Asp Glu Phe Asn 145 150 155 160 ggc gcg aaatat cac act gcg gct tgg cca gct gat ggc gta gat ttc 528 Gly Ala Lys TyrHis Thr Ala Ala Trp Pro Ala Asp Gly Val Asp Phe 165 170 175 acg ggc aagaag gtt gga gtc att ggg gtt ggg gcc tcg gga att caa 576 Thr Gly Lys LysVal Gly Val Ile Gly Val Gly Ala Ser Gly Ile Gln 180 185 190 atc att cccgag ctc gcc aag ttg gct ggc gaa cta ttc gta ttc cag 624 Ile Ile Pro GluLeu Ala Lys Leu Ala Gly Glu Leu Phe Val Phe Gln 195 200 205 cga act ccgaac tat gtg gtt gag agc aac aac gac aaa gtt gac gcc 672 Arg Thr Pro AsnTyr Val Val Glu Ser Asn Asn Asp Lys Val Asp Ala 210 215 220 
gag tgg atgcag tac gtt cgc gac aac tat gac gaa att ttc gaa cgc 720 Glu Trp Met GlnTyr Val Arg Asp Asn Tyr Asp Glu Ile Phe Glu Arg 225 230 235 240 gca tccaag cac ccg ttc ggg gtc gat atg gag tat ccg acg gat tcc 768 Ala Ser LysHis Pro Phe Gly Val Asp Met Glu Tyr Pro Thr Asp Ser 245 250 255 gcc gtcgag gtt tca gaa gaa gaa cgt aag cga gtc ttt gaa agc aaa 816 Ala Val GluVal Ser Glu Glu Glu Arg Lys Arg Val Phe Glu Ser Lys 260 265 270 tgg gaggag gga ggc ttc cat ttt gca aac gag tgt ttc acg gac ctg 864 Trp Glu GluGly Gly Phe His Phe Ala Asn Glu Cys Phe Thr Asp Leu 275 280 285 ggt accagt cct gag gcc agc gag ctg gcg tca gag ttc ata cgt tcg 912 Gly Thr SerPro Glu Ala Ser Glu Leu Ala Ser Glu Phe Ile Arg Ser 290 295 300 aag attcgg gag gtc gtt aag gac ccc gct acg gca gat ctc ctt tgt 960 Lys Ile ArgGlu Val Val Lys Asp Pro Ala Thr Ala Asp Leu Leu Cys 305 310 315 320 cccaag tcg tac tcg ttc aac ggt aag cga gtg ccg acc ggc cac ggc 1008 Pro LysSer Tyr Ser Phe Asn Gly Lys Arg Val Pro Thr Gly His Gly 325 330 335 tactac gag acg ttc aat cgc acg aat gtg cac ctt ttg gat gcc agg 1056 Tyr TyrGlu Thr Phe Asn Arg Thr Asn Val His Leu Leu Asp Ala Arg 340 345 350 ggcact cca att act cgg atc agc agc aaa ggt atc gtt cac gga gac 1104 Gly ThrPro Ile Thr Arg Ile Ser Ser Lys Gly Ile Val His Gly Asp 355 360 365 accgaa tac gaa cta gat gca atc gtg ttc gca acc ggc ttc gac gcg 1152 Thr GluTyr Glu Leu Asp Ala Ile Val Phe Ala Thr Gly Phe Asp Ala 370 375 380 atgaca ggt acg ctc acc aac att gac atc gtc ggc cgc gac gga gtc 1200 Met ThrGly Thr Leu Thr Asn Ile Asp Ile Val Gly Arg Asp Gly Val 385 390 395 400atc ctc cgc gac aag tgg gcc cag gat ggg ctt agg aca aac att ggt 1248 IleLeu Arg Asp Lys Trp Ala Gln Asp Gly Leu Arg Thr Asn Ile Gly 405 410 415ctt act gta aac ggc ttc ccg aac ttc ctg atg tct ctt gga cct cag 1296 LeuThr Val Asn Gly Phe Pro Asn Phe Leu Met Ser Leu Gly Pro Gln 420 425 430acc ccg tac tcc aac ctt gtt gtt cct att cag ttg gga gcc caa tgg 1344 ThrPro Tyr Ser Asn Leu Val Val Pro Ile Gln Leu Gly Ala Gln Trp 435 
440 445atg cag cga ttc ctt aag ttc att cag gaa cgc ggc att gaa gtg ttc 1392 MetGln Arg Phe Leu Lys Phe Ile Gln Glu Arg Gly Ile Glu Val Phe 450 455 460gag tcg tcg aga gaa gct gaa gaa atc tgg aat gcc gaa acc att cgc 1440 GluSer Ser Arg Glu Ala Glu Glu Ile Trp Asn Ala Glu Thr Ile Arg 465 470 475480 ggc gct gaa tct acg gtc atg tcc atc gaa gga ccc aaa gcc ggc gca 1488Gly Ala Glu Ser Thr Val Met Ser Ile Glu Gly Pro Lys Ala Gly Ala 485 490495 tgg ttc atc ggc ggc aac att ccc ggt aaa tca cgt gag tac cag gtg 1536Trp Phe Ile Gly Gly Asn Ile Pro Gly Lys Ser Arg Glu Tyr Gln Val 500 505510 tat atg ggc ggc ggt cag gtc tac cag gac tgg tgc cgc gag gcg gaa 1584Tyr Met Gly Gly Gly Gln Val Tyr Gln Asp Trp Cys Arg Glu Ala Glu 515 520525 gaa tcc gac tac gcc act ttt ctg aat gct gac tcc att gac ggc gaa 1632Glu Ser Asp Tyr Ala Thr Phe Leu Asn Ala Asp Ser Ile Asp Gly Glu 530 535540 aag gtt cgt gaa tcg gcg ggt atg aaa tag 1662 Lys Val Arg Glu Ser AlaGly Met Lys 545 550 14 553 PRT Brevibacterium sp. HCU 14 Met Pro Ile ThrGln Gln Leu Asp His Asp Ala Ile Val Ile Gly Ala 1 5 10 15 Gly Phe SerGly Leu Ala Ile Leu His His Leu Arg Glu Ile Gly Leu 20 25 30 Asp Thr GlnIle Val Glu Ala Thr Asp Gly Ile Gly Gly Thr Trp Trp 35 40 45 Ile Asn ArgTyr Pro Gly Val Arg Thr Asp Ser Glu Phe His Tyr Tyr 50 55 60 Ser Phe SerPhe Ser Lys Glu Val Arg Asp Glu Trp Thr Trp Thr Gln 65 70 75 80 Arg TyrPro Asp Gly Glu Glu Val Cys Ala Tyr Leu Asn Phe Ile Ala 85 90 95 Asp ArgLeu Asp Leu Arg Lys Asp Ile Gln Leu Asn Ser Arg Val Asn 100 105 110 ThrAla Arg Trp Asn Glu Thr Glu Lys Tyr Trp Asp Val Ile Phe Glu 115 120 125Asp Gly Ser Ser Lys Arg Ala Arg Phe Leu Ile Ser Ala Met Gly Ala 130 135140 Leu Ser Gln Ala Ile Phe Pro Ala Ile Asp Gly Ile Asp Glu Phe Asn 145150 155 160 Gly Ala Lys Tyr His Thr Ala Ala Trp Pro Ala Asp Gly Val AspPhe 165 170 175 Thr Gly Lys Lys Val Gly Val Ile Gly Val Gly Ala Ser GlyIle Gln 180 185 190 Ile Ile Pro Glu Leu Ala Lys Leu Ala Gly Glu Leu PheVal Phe Gln 195 200 205 Arg Thr Pro Asn Tyr Val Val Glu Ser Asn 
Asn AspLys Val Asp Ala 210 215 220 Glu Trp Met Gln Tyr Val Arg Asp Asn Tyr AspGlu Ile Phe Glu Arg 225 230 235 240 Ala Ser Lys His Pro Phe Gly Val AspMet Glu Tyr Pro Thr Asp Ser 245 250 255 Ala Val Glu Val Ser Glu Glu GluArg Lys Arg Val Phe Glu Ser Lys 260 265 270 Trp Glu Glu Gly Gly Phe HisPhe Ala Asn Glu Cys Phe Thr Asp Leu 275 280 285 Gly Thr Ser Pro Glu AlaSer Glu Leu Ala Ser Glu Phe Ile Arg Ser 290 295 300 Lys Ile Arg Glu ValVal Lys Asp Pro Ala Thr Ala Asp Leu Leu Cys 305 310 315 320 Pro Lys SerTyr Ser Phe Asn Gly Lys Arg Val Pro Thr Gly His Gly 325 330 335 Tyr TyrGlu Thr Phe Asn Arg Thr Asn Val His Leu Leu Asp Ala Arg 340 345 350 GlyThr Pro Ile Thr Arg Ile Ser Ser Lys Gly Ile Val His Gly Asp 355 360 365Thr Glu Tyr Glu Leu Asp Ala Ile Val Phe Ala Thr Gly Phe Asp Ala 370 375380 Met Thr Gly Thr Leu Thr Asn Ile Asp Ile Val Gly Arg Asp Gly Val 385390 395 400 Ile Leu Arg Asp Lys Trp Ala Gln Asp Gly Leu Arg Thr Asn IleGly 405 410 415 Leu Thr Val Asn Gly Phe Pro Asn Phe Leu Met Ser Leu GlyPro Gln 420 425 430 Thr Pro Tyr Ser Asn Leu Val Val Pro Ile Gln Leu GlyAla Gln Trp 435 440 445 Met Gln Arg Phe Leu Lys Phe Ile Gln Glu Arg GlyIle Glu Val Phe 450 455 460 Glu Ser Ser Arg Glu Ala Glu Glu Ile Trp AsnAla Glu Thr Ile Arg 465 470 475 480 Gly Ala Glu Ser Thr Val Met Ser IleGlu Gly Pro Lys Ala Gly Ala 485 490 495 Trp Phe Ile Gly Gly Asn Ile ProGly Lys Ser Arg Glu Tyr Gln Val 500 505 510 Tyr Met Gly Gly Gly Gln ValTyr Gln Asp Trp Cys Arg Glu Ala Glu 515 520 525 Glu Ser Asp Tyr Ala ThrPhe Leu Asn Ala Asp Ser Ile Asp Gly Glu 530 535 540 Lys Val Arg Glu SerAla Gly Met Lys 545 550 15 1590 DNA Brevibacterium sp. 
HCU CDS(1)..(1590) 15 atg acg tca acc atg cct gca ccg aca gca gca cag gcg aacgca gac 48 Met Thr Ser Thr Met Pro Ala Pro Thr Ala Ala Gln Ala Asn AlaAsp 1 5 10 15 gag acc gag gtc ctc gac gca ctc atc gtg ggt ggc gga ttctcg ggg 96 Glu Thr Glu Val Leu Asp Ala Leu Ile Val Gly Gly Gly Phe SerGly 20 25 30 cct gta tct gtc gac cgc ctg cgt gaa gac ggg ttc aag gtc aaggtc 144 Pro Val Ser Val Asp Arg Leu Arg Glu Asp Gly Phe Lys Val Lys Val35 40 45 tgg gac gcc gcc ggc gga ttc ggc ggc atc tgg tgg tgg aac tgc tac192 Trp Asp Ala Ala Gly Gly Phe Gly Gly Ile Trp Trp Trp Asn Cys Tyr 5055 60 ccg ggt gct cgt acg gac agc acc gga cag atc tat cag ttc cag tac240 Pro Gly Ala Arg Thr Asp Ser Thr Gly Gln Ile Tyr Gln Phe Gln Tyr 6570 75 80 aag gac ctg tgg aag gac ttc gac ttc aag gag ctc tac ccc gac ttc288 Lys Asp Leu Trp Lys Asp Phe Asp Phe Lys Glu Leu Tyr Pro Asp Phe 8590 95 aac ggg gtt cgg gag tac ttc gag tac gtc gac tcg cag ctc gac ctg336 Asn Gly Val Arg Glu Tyr Phe Glu Tyr Val Asp Ser Gln Leu Asp Leu 100105 110 tcc cgc gac gtc aca ttc aac acc ttt gcg gag tcc tgc aca tgg gac384 Ser Arg Asp Val Thr Phe Asn Thr Phe Ala Glu Ser Cys Thr Trp Asp 115120 125 gac gct gcc aag gag tgg acg gtg cga tcg tcg gaa gga cgt gag cag432 Asp Ala Ala Lys Glu Trp Thr Val Arg Ser Ser Glu Gly Arg Glu Gln 130135 140 cgg gcc cgt gcg gtc atc gtc gcc acc ggc ttc ggt gcg aag ccc ctc480 Arg Ala Arg Ala Val Ile Val Ala Thr Gly Phe Gly Ala Lys Pro Leu 145150 155 160 tac ccg aac atc gag ggc ctc gac agc ttc gaa ggc gag tgc catcac 528 Tyr Pro Asn Ile Glu Gly Leu Asp Ser Phe Glu Gly Glu Cys His His165 170 175 acc gca cgc tgg ccg cag ggt ggc ctc gac atg acg ggc aag cgagtc 576 Thr Ala Arg Trp Pro Gln Gly Gly Leu Asp Met Thr Gly Lys Arg Val180 185 190 gtc gtc atg ggc acc ggt gct tcc ggc atc cag gtc att caa gaagcc 624 Val Val Met Gly Thr Gly Ala Ser Gly Ile Gln Val Ile Gln Glu Ala195 200 205 gcg gcg gtt gcc gaa cac ctc acc gtc ttc cag cgc acc ccg aacctt 672 Ala Ala Val Ala Glu His Leu Thr Val Phe Gln Arg Thr Pro Asn Leu210 215 220 
gcc ctg ccg atg cgg cag cag cgg ctg tcg gcc gat gac aac gatcgc 720 Ala Leu Pro Met Arg Gln Gln Arg Leu Ser Ala Asp Asp Asn Asp Arg225 230 235 240 tac cga gag aac atc gaa gat cgt ttc caa atc cgt gac aattcg ttt 768 Tyr Arg Glu Asn Ile Glu Asp Arg Phe Gln Ile Arg Asp Asn SerPhe 245 250 255 gcc gga ttc gac ttc tac ttc atc ccg cag aac gcc gcg gacacc ccc 816 Ala Gly Phe Asp Phe Tyr Phe Ile Pro Gln Asn Ala Ala Asp ThrPro 260 265 270 gag gac gag cgg acc gcg atc tac gaa aag atg tgg gac gaaggc gga 864 Glu Asp Glu Arg Thr Ala Ile Tyr Glu Lys Met Trp Asp Glu GlyGly 275 280 285 ttc cca ctg tgg ctc gga aac ttc cag gga ctc ctc acc gatgag gca 912 Phe Pro Leu Trp Leu Gly Asn Phe Gln Gly Leu Leu Thr Asp GluAla 290 295 300 gcc aac cac acc ttc tac aac ttc tgg cgt tcg aag gtg cacgat cgt 960 Ala Asn His Thr Phe Tyr Asn Phe Trp Arg Ser Lys Val His AspArg 305 310 315 320 gtg aag gat ccc aag acc gcc gag atg ctc gca ccg gcgacc cca ccg 1008 Val Lys Asp Pro Lys Thr Ala Glu Met Leu Ala Pro Ala ThrPro Pro 325 330 335 cac ccg ttc ggc gtc aag cgt ccc tcg ctc gaa cag aactac ttc gac 1056 His Pro Phe Gly Val Lys Arg Pro Ser Leu Glu Gln Asn TyrPhe Asp 340 345 350 gta tac aac cag gac aat gtc gat ctc atc gac tcg aatgcc acc ccg 1104 Val Tyr Asn Gln Asp Asn Val Asp Leu Ile Asp Ser Asn AlaThr Pro 355 360 365 atc acc cgg gtc ctt ccg aac ggg gtc gaa acc ccg gacgga gtc gtc 1152 Ile Thr Arg Val Leu Pro Asn Gly Val Glu Thr Pro Asp GlyVal Val 370 375 380 gaa tgc gat gtc ctc gtg ctg gcc acc ggc ttc gac aacaac agc ggc 1200 Glu Cys Asp Val Leu Val Leu Ala Thr Gly Phe Asp Asn AsnSer Gly 385 390 395 400 ggc atc aac gcc atc gat atc aaa gcc ggc ggg cagctg ctg cgt gac 1248 Gly Ile Asn Ala Ile Asp Ile Lys Ala Gly Gly Gln LeuLeu Arg Asp 405 410 415 aag tgg gcg acc ggc gtg gac acc tac atg ggg ctgtcg acg cac gga 1296 Lys Trp Ala Thr Gly Val Asp Thr Tyr Met Gly Leu SerThr His Gly 420 425 430 ttc ccc aat ctc atg ttc ctc tac ggc ccg cag agccct tcg ggc ttc 1344 Phe Pro Asn Leu Met Phe Leu Tyr Gly Pro Gln Ser ProSer Gly Phe 435 
440 445 tgc aat ggg acc gac ttc ggc gga gcg cca ggc gatatg gtc gcc gac 1392 Cys Asn Gly Thr Asp Phe Gly Gly Ala Pro Gly Asp MetVal Ala Asp 450 455 460 ttc ctc atc tgg ctc aag gac aac ggc atc tcg cggttc gaa tcc acc 1440 Phe Leu Ile Trp Leu Lys Asp Asn Gly Ile Ser Arg PheGlu Ser Thr 465 470 475 480 gaa gag gtc gag cgg gaa tgg cgc gcc cat gtcgac gac atc ttc gtc 1488 Glu Glu Val Glu Arg Glu Trp Arg Ala His Val AspAsp Ile Phe Val 485 490 495 aac tcg ctg ttc ccc aag gcg aag tcc tgg tactgg ggc gcc aac gtc 1536 Asn Ser Leu Phe Pro Lys Ala Lys Ser Trp Tyr TrpGly Ala Asn Val 500 505 510 ccc ggc aag ccg gcg cag atg ctc aac tat tcggag gcg tcc ccg cat 1584 Pro Gly Lys Pro Ala Gln Met Leu Asn Tyr Ser GluAla Ser Pro His 515 520 525 atc tag 1590 Ile 16 529 PRT Brevibacteriumsp. HCU 16 Met Thr Ser Thr Met Pro Ala Pro Thr Ala Ala Gln Ala Asn AlaAsp 1 5 10 15 Glu Thr Glu Val Leu Asp Ala Leu Ile Val Gly Gly Gly PheSer Gly 20 25 30 Pro Val Ser Val Asp Arg Leu Arg Glu Asp Gly Phe Lys ValLys Val 35 40 45 Trp Asp Ala Ala Gly Gly Phe Gly Gly Ile Trp Trp Trp AsnCys Tyr 50 55 60 Pro Gly Ala Arg Thr Asp Ser Thr Gly Gln Ile Tyr Gln PheGln Tyr 65 70 75 80 Lys Asp Leu Trp Lys Asp Phe Asp Phe Lys Glu Leu TyrPro Asp Phe 85 90 95 Asn Gly Val Arg Glu Tyr Phe Glu Tyr Val Asp Ser GlnLeu Asp Leu 100 105 110 Ser Arg Asp Val Thr Phe Asn Thr Phe Ala Glu SerCys Thr Trp Asp 115 120 125 Asp Ala Ala Lys Glu Trp Thr Val Arg Ser SerGlu Gly Arg Glu Gln 130 135 140 Arg Ala Arg Ala Val Ile Val Ala Thr GlyPhe Gly Ala Lys Pro Leu 145 150 155 160 Tyr Pro Asn Ile Glu Gly Leu AspSer Phe Glu Gly Glu Cys His His 165 170 175 Thr Ala Arg Trp Pro Gln GlyGly Leu Asp Met Thr Gly Lys Arg Val 180 185 190 Val Val Met Gly Thr GlyAla Ser Gly Ile Gln Val Ile Gln Glu Ala 195 200 205 Ala Ala Val Ala GluHis Leu Thr Val Phe Gln Arg Thr Pro Asn Leu 210 215 220 Ala Leu Pro MetArg Gln Gln Arg Leu Ser Ala Asp Asp Asn Asp Arg 225 230 235 240 Tyr ArgGlu Asn Ile Glu Asp Arg Phe Gln Ile Arg Asp Asn Ser Phe 245 250 255 AlaGly Phe Asp Phe Tyr 
Phe Ile Pro Gln Asn Ala Ala Asp Thr Pro 260 265 270Glu Asp Glu Arg Thr Ala Ile Tyr Glu Lys Met Trp Asp Glu Gly Gly 275 280285 Phe Pro Leu Trp Leu Gly Asn Phe Gln Gly Leu Leu Thr Asp Glu Ala 290295 300 Ala Asn His Thr Phe Tyr Asn Phe Trp Arg Ser Lys Val His Asp Arg305 310 315 320 Val Lys Asp Pro Lys Thr Ala Glu Met Leu Ala Pro Ala ThrPro Pro 325 330 335 His Pro Phe Gly Val Lys Arg Pro Ser Leu Glu Gln AsnTyr Phe Asp 340 345 350 Val Tyr Asn Gln Asp Asn Val Asp Leu Ile Asp SerAsn Ala Thr Pro 355 360 365 Ile Thr Arg Val Leu Pro Asn Gly Val Glu ThrPro Asp Gly Val Val 370 375 380 Glu Cys Asp Val Leu Val Leu Ala Thr GlyPhe Asp Asn Asn Ser Gly 385 390 395 400 Gly Ile Asn Ala Ile Asp Ile LysAla Gly Gly Gln Leu Leu Arg Asp 405 410 415 Lys Trp Ala Thr Gly Val AspThr Tyr Met Gly Leu Ser Thr His Gly 420 425 430 Phe Pro Asn Leu Met PheLeu Tyr Gly Pro Gln Ser Pro Ser Gly Phe 435 440 445 Cys Asn Gly Thr AspPhe Gly Gly Ala Pro Gly Asp Met Val Ala Asp 450 455 460 Phe Leu Ile TrpLeu Lys Asp Asn Gly Ile Ser Arg Phe Glu Ser Thr 465 470 475 480 Glu GluVal Glu Arg Glu Trp Arg Ala His Val Asp Asp Ile Phe Val 485 490 495 AsnSer Leu Phe Pro Lys Ala Lys Ser Trp Tyr Trp Gly Ala Asn Val 500 505 510Pro Gly Lys Pro Ala Gln Met Leu Asn Tyr Ser Glu Ala Ser Pro His 515 520525 Ile 17 1614 DNA Brachymonas sp. 
CHX 17 atgtcttcct cgccaagcagcgccattcat ttcgatgcca tcgttgtggg cgccggattt 60 ggcggcatgt atatgctgcacaaactgcgc gaccagctcg gactcaaggt caaggttttc 120 gacacagccg gcggcatcggcggcacctgg tattggaatc gctatcctgg agccttgtcc 180 gacacgcaca gtcatgtctatcagtattct ttcgacgaag cgatgctcca agaatggaca 240 tggaagaaca aatacctcacgcagccagaa atactggctt atctggagta tgtagcagac 300 cggctcgatc tgcgcccggacattcagttg aacacgaccg tgacatcgat gcatttcaat 360 gaagtccaca acatctgggaagtgcgcacg gaccggggcg ggtactacac cgcgcgcttt 420 atcgtgacgg cactgggtttgttatccgcg atcaactggc ccaacattcc gggccgcgaa 480 agcttccaag gcgagatgtatcacacagcc gcctggccaa aagatgtcga actgcgcggc 540 aaacgcgtcg gcgtgatcggcaccggctcg acgggtgtgc agctgattac cgccatcgct 600 ccagaggtca aacacctgacggtcttccag cgtacaccgc aatacagcgt gccgacggga 660 aatcgtcctg tctccgcgcaagaaatcgca gaagtcaagc gaaacttcag caaggtatgg 720 caacaagtac gtgaatccgccgtcgcattc ggcttcgagg aaagcacagt gcccgcgatg 780 agcgtctccg aagccgaacgccagcgcgtc tttcaggaag cctggaacca aggcaacggc 840 ttttactaca tgttcggcacattttgcgac atcgccaccg acccgcaggc caacgaagcc 900 gcagccacct tcatacgcaacaaaatcgcc gagatcgtca aagacccgga aaccgcccgc 960 aagctcacgc ctacggatgtttacgcccga cgcccgcttt gcgacagtgg ctactatcgc 1020 acctacaacc gcagcaacgtctcactggtg gatgtgaagg cgacaccaat cagtgcgatg 1080 acgccccggg gcattcgcaccgccgacggt gtcgagcacg agttggatat gttgatcctt 1140 gccactggct atgacgccgtcgatggcaat taccgccgca tcgacctgcg cggccgtggc 1200 ggccaaacca tcaatgagcactggaacgac actcctacca gttatgtagg ggtcagcacc 1260 gccaacttcc ccaacatgttcatgatcctg ggcccgaatg gcccattcac gaacctgccg 1320 ccgtcgatcg aagcacaggtcgaatggatc accgacctgg ttgcccacat gcgccagcac 1380 gggctcgcga cggccgaaccaacgcgcgat gctgaagatg cctggggccg cacctgcgcg 1440 gaaatcgccg agcagacgctttttggccag gttgaatcat ggatcttcgg tgccaacagc 1500 cccgggaaga aacatactttgatgttctat ctggccggcc tggggaacta ccgcaagcag 1560 ctcgccgacg tagcgaacgcgcaataccaa ggctttgcgt tccaaccact gtaa 1614 18 538 PRT Brachymonas sp.CHX 18 Met Ser Ser Ser Pro Ser Ser Ala Ile His Phe Asp Ala Ile Val Val 15 10 15 Gly Ala Gly Phe Gly 
Gly Met Tyr Met Leu His Lys Leu Arg Asp Gln20 25 30 Leu Gly Leu Lys Val Lys Val Phe Asp Thr Ala Gly Gly Ile Gly Gly35 40 45 Thr Trp Tyr Trp Asn Arg Tyr Pro Gly Ala Leu Ser Asp Thr His Ser50 55 60 His Val Tyr Gln Tyr Ser Phe Asp Glu Ala Met Leu Gln Glu Trp Thr65 70 75 80 Trp Lys Asn Lys Tyr Leu Thr Gln Pro Glu Ile Leu Ala Tyr LeuGlu 85 90 95 Tyr Val Ala Asp Arg Leu Asp Leu Arg Pro Asp Ile Gln Leu AsnThr 100 105 110 Thr Val Thr Ser Met His Phe Asn Glu Val His Asn Ile TrpGlu Val 115 120 125 Arg Thr Asp Arg Gly Gly Tyr Tyr Thr Ala Arg Phe IleVal Thr Ala 130 135 140 Leu Gly Leu Leu Ser Ala Ile Asn Trp Pro Asn IlePro Gly Arg Glu 145 150 155 160 Ser Phe Gln Gly Glu Met Tyr His Thr AlaAla Trp Pro Lys Asp Val 165 170 175 Glu Leu Arg Gly Lys Arg Val Gly ValIle Gly Thr Gly Ser Thr Gly 180 185 190 Val Gln Leu Ile Thr Ala Ile AlaPro Glu Val Lys His Leu Thr Val 195 200 205 Phe Gln Arg Thr Pro Gln TyrSer Val Pro Thr Gly Asn Arg Pro Val 210 215 220 Ser Ala Gln Glu Ile AlaGlu Val Lys Arg Asn Phe Ser Lys Val Trp 225 230 235 240 Gln Gln Val ArgGlu Ser Ala Val Ala Phe Gly Phe Glu Glu Ser Thr 245 250 255 Val Pro AlaMet Ser Val Ser Glu Ala Glu Arg Gln Arg Val Phe Gln 260 265 270 Glu AlaTrp Asn Gln Gly Asn Gly Phe Tyr Tyr Met Phe Gly Thr Phe 275 280 285 CysAsp Ile Ala Thr Asp Pro Gln Ala Asn Glu Ala Ala Ala Thr Phe 290 295 300Ile Arg Asn Lys Ile Ala Glu Ile Val Lys Asp Pro Glu Thr Ala Arg 305 310315 320 Lys Leu Thr Pro Thr Asp Val Tyr Ala Arg Arg Pro Leu Cys Asp Ser325 330 335 Gly Tyr Tyr Arg Thr Tyr Asn Arg Ser Asn Val Ser Leu Val AspVal 340 345 350 Lys Ala Thr Pro Ile Ser Ala Met Thr Pro Arg Gly Ile ArgThr Ala 355 360 365 Asp Gly Val Glu His Glu Leu Asp Met Leu Ile Leu AlaThr Gly Tyr 370 375 380 Asp Ala Val Asp Gly Asn Tyr Arg Arg Ile Asp LeuArg Gly Arg Gly 385 390 395 400 Gly Gln Thr Ile Asn Glu His Trp Asn AspThr Pro Thr Ser Tyr Val 405 410 415 Gly Val Ser Thr Ala Asn Phe Pro AsnMet Phe Met Ile Leu Gly Pro 420 425 430 Asn Gly Pro Phe Thr Asn Leu ProPro Ser Ile Glu Ala Gln Val Glu 435 
440 445 Trp Ile Thr Asp Leu Val AlaHis Met Arg Gln His Gly Leu Ala Thr 450 455 460 Ala Glu Pro Thr Arg AspAla Glu Asp Ala Trp Gly Arg Thr Cys Ala 465 470 475 480 Glu Ile Ala GluGln Thr Leu Phe Gly Gln Val Glu Ser Trp Ile Phe 485 490 495 Gly Ala AsnSer Pro Gly Lys Lys His Thr Leu Met Phe Tyr Leu Ala 500 505 510 Gly LeuGly Asn Tyr Arg Lys Gln Leu Ala Asp Val Ala Asn Ala Gln 515 520 525 TyrGln Gly Phe Ala Phe Gln Pro Leu Glx 530 535 19 1644 DNA Acinetobactersp. SE19 CDS (1)..(1644) 19 atg gag att atc atg tca caa aaa atg gat tttgat gct atc gtg att 48 Met Glu Ile Ile Met Ser Gln Lys Met Asp Phe AspAla Ile Val Ile 1 5 10 15 ggt ggt ggt ttt ggc gga ctt tat gca gtc aaaaaa tta aga gac gag 96 Gly Gly Gly Phe Gly Gly Leu Tyr Ala Val Lys LysLeu Arg Asp Glu 20 25 30 ctc gaa ctt aag gtt cag gct ttt gat aaa gcc acggat gtc gca ggt 144 Leu Glu Leu Lys Val Gln Ala Phe Asp Lys Ala Thr AspVal Ala Gly 35 40 45 act tgg tac tgg aac cgt tac cca ggt gca ttg tcg gataca gaa acc 192 Thr Trp Tyr Trp Asn Arg Tyr Pro Gly Ala Leu Ser Asp ThrGlu Thr 50 55 60 cac ctc tac tgc tat tct tgg gat aaa gaa tta cta caa tcgcta gaa 240 His Leu Tyr Cys Tyr Ser Trp Asp Lys Glu Leu Leu Gln Ser LeuGlu 65 70 75 80 atc aag aaa aaa tat gtg caa ggc cct gat gta cgc aag tattta cag 288 Ile Lys Lys Lys Tyr Val Gln Gly Pro Asp Val Arg Lys Tyr LeuGln 85 90 95 caa gtg gct gaa aag cat gat tta aag aag agc tat caa ttc aatacc 336 Gln Val Ala Glu Lys His Asp Leu Lys Lys Ser Tyr Gln Phe Asn Thr100 105 110 gcg gtt caa tcg gct cat tac aac gaa gca gat gcc ttg tgg gaagtc 384 Ala Val Gln Ser Ala His Tyr Asn Glu Ala Asp Ala Leu Trp Glu Val115 120 125 acc act gaa tat ggt gat aag tac acg gcg cgt ttc ctc atc actgct 432 Thr Thr Glu Tyr Gly Asp Lys Tyr Thr Ala Arg Phe Leu Ile Thr Ala130 135 140 tta ggc tta ttg tct gcg cct aac ttg cca aac atc aaa ggc attaat 480 Leu Gly Leu Leu Ser Ala Pro Asn Leu Pro Asn Ile Lys Gly Ile Asn145 150 155 160 cag ttt aaa ggt gag ctg cat cat acc agc cgc tgg cca gatgac gta 528 Gln Phe Lys Gly Glu Leu His His Thr 
Ser Arg Trp Pro Asp AspVal 165 170 175 agt ttt gaa ggt aaa cgt gtc ggc gtg att ggt acg ggt tccacc ggt 576 Ser Phe Glu Gly Lys Arg Val Gly Val Ile Gly Thr Gly Ser ThrGly 180 185 190 gtt cag gtt att acg gct gtg gca cct ctg gct aaa cac ctcact gtc 624 Val Gln Val Ile Thr Ala Val Ala Pro Leu Ala Lys His Leu ThrVal 195 200 205 ttc cag cgt tct gca caa tac agc gtt cca att ggc aat gatcca ctg 672 Phe Gln Arg Ser Ala Gln Tyr Ser Val Pro Ile Gly Asn Asp ProLeu 210 215 220 tct gaa gaa gat gtt aaa aag atc aaa gac aat tat gac aaaatt tgg 720 Ser Glu Glu Asp Val Lys Lys Ile Lys Asp Asn Tyr Asp Lys IleTrp 225 230 235 240 gat ggt gta tgg aat tca gcc ctt gcc ttt ggc ctg aatgaa agc aca 768 Asp Gly Val Trp Asn Ser Ala Leu Ala Phe Gly Leu Asn GluSer Thr 245 250 255 gtg cca gca atg agc gta tca gct gaa gaa cgc aag gcagtt ttt gaa 816 Val Pro Ala Met Ser Val Ser Ala Glu Glu Arg Lys Ala ValPhe Glu 260 265 270 aag gca tgg caa aca ggt ggc ggt ttc cgt ttc atg tttgaa act ttc 864 Lys Ala Trp Gln Thr Gly Gly Gly Phe Arg Phe Met Phe GluThr Phe 275 280 285 ggt gat att gcc acc aat atg gaa gcc aat atc gaa gcgcaa aat ttc 912 Gly Asp Ile Ala Thr Asn Met Glu Ala Asn Ile Glu Ala GlnAsn Phe 290 295 300 att aag ggt aaa att gct gaa atc gtc aaa gat cca gccatt gca cag 960 Ile Lys Gly Lys Ile Ala Glu Ile Val Lys Asp Pro Ala IleAla Gln 305 310 315 320 aag ctt atg cca cag gat ttg tat gca aaa cgt ccgttg tgt gac agt 1008 Lys Leu Met Pro Gln Asp Leu Tyr Ala Lys Arg Pro LeuCys Asp Ser 325 330 335 ggt tac tac aac acc ttt aac cgt gac aat gtc cgttta gaa gat gtg 1056 Gly Tyr Tyr Asn Thr Phe Asn Arg Asp Asn Val Arg LeuGlu Asp Val 340 345 350 aaa gcc aat ccg att gtt gaa att acc gaa aac ggtgtg aaa ctc gaa 1104 Lys Ala Asn Pro Ile Val Glu Ile Thr Glu Asn Gly ValLys Leu Glu 355 360 365 aat ggc gat ttc gtt gaa tta gac atg ctg ata tgtgcc aca ggt ttt 1152 Asn Gly Asp Phe Val Glu Leu Asp Met Leu Ile Cys AlaThr Gly Phe 370 375 380 gat gcc gtc gat ggc aac tat gtg cgc atg gac attcaa ggt aaa aac 1200 Asp Ala Val Asp Gly Asn Tyr Val 
Arg Met Asp Ile GlnGly Lys Asn 385 390 395 400 ggc ttg gcc atg aaa gac tac tgg aaa gaa ggtccg tcg agc tat atg 1248 Gly Leu Ala Met Lys Asp Tyr Trp Lys Glu Gly ProSer Ser Tyr Met 405 410 415 ggt gtc acc gta aat aac tat cca aac atg ttcatg gtg ctt gga ccg 1296 Gly Val Thr Val Asn Asn Tyr Pro Asn Met Phe MetVal Leu Gly Pro 420 425 430 aat ggc ccg ttt acc aac ctg ccg cca tca attgaa tca cag gtg gaa 1344 Asn Gly Pro Phe Thr Asn Leu Pro Pro Ser Ile GluSer Gln Val Glu 435 440 445 tgg atc agt gat acc att caa tac acg gtt gaaaac aat gtt gaa tcc 1392 Trp Ile Ser Asp Thr Ile Gln Tyr Thr Val Glu AsnAsn Val Glu Ser 450 455 460 att gaa gcg aca aaa gaa gcg gaa gaa caa tggact caa act tgc gcc 1440 Ile Glu Ala Thr Lys Glu Ala Glu Glu Gln Trp ThrGln Thr Cys Ala 465 470 475 480 aat att gcg gaa atg acc tta ttc cct aaagcg caa tcc tgg att ttt 1488 Asn Ile Ala Glu Met Thr Leu Phe Pro Lys AlaGln Ser Trp Ile Phe 485 490 495 ggt gcg aat atc ccg ggc aag aaa aac acggtt tac ttc tat ctc ggt 1536 Gly Ala Asn Ile Pro Gly Lys Lys Asn Thr ValTyr Phe Tyr Leu Gly 500 505 510 ggt tta aaa gaa tat cgc agt gcg cta gccaac tgc aaa aac cat gcc 1584 Gly Leu Lys Glu Tyr Arg Ser Ala Leu Ala AsnCys Lys Asn His Ala 515 520 525 tat gaa ggt ttt gat att caa tta caa cgttca gat atc aag caa cct 1632 Tyr Glu Gly Phe Asp Ile Gln Leu Gln Arg SerAsp Ile Lys Gln Pro 530 535 540 gcc aat gcc taa 1644 Ala Asn Ala 545 20547 PRT Acinetobacter sp. 
SE19 20 Met Glu Ile Ile Met Ser Gln Lys MetAsp Phe Asp Ala Ile Val Ile 1 5 10 15 Gly Gly Gly Phe Gly Gly Leu TyrAla Val Lys Lys Leu Arg Asp Glu 20 25 30 Leu Glu Leu Lys Val Gln Ala PheAsp Lys Ala Thr Asp Val Ala Gly 35 40 45 Thr Trp Tyr Trp Asn Arg Tyr ProGly Ala Leu Ser Asp Thr Glu Thr 50 55 60 His Leu Tyr Cys Tyr Ser Trp AspLys Glu Leu Leu Gln Ser Leu Glu 65 70 75 80 Ile Lys Lys Lys Tyr Val GlnGly Pro Asp Val Arg Lys Tyr Leu Gln 85 90 95 Gln Val Ala Glu Lys His AspLeu Lys Lys Ser Tyr Gln Phe Asn Thr 100 105 110 Ala Val Gln Ser Ala HisTyr Asn Glu Ala Asp Ala Leu Trp Glu Val 115 120 125 Thr Thr Glu Tyr GlyAsp Lys Tyr Thr Ala Arg Phe Leu Ile Thr Ala 130 135 140 Leu Gly Leu LeuSer Ala Pro Asn Leu Pro Asn Ile Lys Gly Ile Asn 145 150 155 160 Gln PheLys Gly Glu Leu His His Thr Ser Arg Trp Pro Asp Asp Val 165 170 175 SerPhe Glu Gly Lys Arg Val Gly Val Ile Gly Thr Gly Ser Thr Gly 180 185 190Val Gln Val Ile Thr Ala Val Ala Pro Leu Ala Lys His Leu Thr Val 195 200205 Phe Gln Arg Ser Ala Gln Tyr Ser Val Pro Ile Gly Asn Asp Pro Leu 210215 220 Ser Glu Glu Asp Val Lys Lys Ile Lys Asp Asn Tyr Asp Lys Ile Trp225 230 235 240 Asp Gly Val Trp Asn Ser Ala Leu Ala Phe Gly Leu Asn GluSer Thr 245 250 255 Val Pro Ala Met Ser Val Ser Ala Glu Glu Arg Lys AlaVal Phe Glu 260 265 270 Lys Ala Trp Gln Thr Gly Gly Gly Phe Arg Phe MetPhe Glu Thr Phe 275 280 285 Gly Asp Ile Ala Thr Asn Met Glu Ala Asn IleGlu Ala Gln Asn Phe 290 295 300 Ile Lys Gly Lys Ile Ala Glu Ile Val LysAsp Pro Ala Ile Ala Gln 305 310 315 320 Lys Leu Met Pro Gln Asp Leu TyrAla Lys Arg Pro Leu Cys Asp Ser 325 330 335 Gly Tyr Tyr Asn Thr Phe AsnArg Asp Asn Val Arg Leu Glu Asp Val 340 345 350 Lys Ala Asn Pro Ile ValGlu Ile Thr Glu Asn Gly Val Lys Leu Glu 355 360 365 Asn Gly Asp Phe ValGlu Leu Asp Met Leu Ile Cys Ala Thr Gly Phe 370 375 380 Asp Ala Val AspGly Asn Tyr Val Arg Met Asp Ile Gln Gly Lys Asn 385 390 395 400 Gly LeuAla Met Lys Asp Tyr Trp Lys Glu Gly Pro Ser Ser Tyr Met 405 410 415 GlyVal Thr Val Asn Asn Tyr Pro Asn Met 
Phe Met Val Leu Gly Pro 420 425 430Asn Gly Pro Phe Thr Asn Leu Pro Pro Ser Ile Glu Ser Gln Val Glu 435 440445 Trp Ile Ser Asp Thr Ile Gln Tyr Thr Val Glu Asn Asn Val Glu Ser 450455 460 Ile Glu Ala Thr Lys Glu Ala Glu Glu Gln Trp Thr Gln Thr Cys Ala465 470 475 480 Asn Ile Ala Glu Met Thr Leu Phe Pro Lys Ala Gln Ser TrpIle Phe 485 490 495 Gly Ala Asn Ile Pro Gly Lys Lys Asn Thr Val Tyr PheTyr Leu Gly 500 505 510 Gly Leu Lys Glu Tyr Arg Ser Ala Leu Ala Asn CysLys Asn His Ala 515 520 525 Tyr Glu Gly Phe Asp Ile Gln Leu Gln Arg SerAsp Ile Lys Gln Pro 530 535 540 Ala Asn Ala 545 21 1320 DNA Rhodococcuserythropolis AN12 21 atgagcacag agggcaagta cgcgctgatc ggagcgggtccgtctggatt ggccggcgcg 60 cgaaacctcg atcgagccgg catagcgttc gacggcttcgagagccacga cgacgtcggt 120 gggctctggg acatcgacaa cccgcacagc accgtctacgagtcggcgca cctcatttcg 180 tcgaagggca ccaccgcatt cgcggagttc ccgatggcggattcggttgc cgactacccg 240 agccacatcg aacttgccga gtatttccgc gactacgccgatacccacga tcttcgcagg 300 cactttgcct tcggcactac cgtcatcgac gttttgccggtcgattcgct gtggcaggtc 360 accacgcgta gtcgcagcgg tgagacttca gtcgcgcggtatcgaggcgt gatcatcgcg 420 aacggaacgc tgtcgaagcc gaacataccg acgttccggggcgacttcac cggcacgttg 480 atgcacacga gcgagtaccg cagtgccgag atcttccgcggaaagagagt gctggtcatc 540 ggagcgggca acagtggatg cgacatcgcc gtcgatgccgtccaccaggc cgagtgcgtc 600 gatttgagcg ttcggcgagg ctactacttc gtccccaagtatctgttcgg gcgaccctcg 660 gacacgttga atcagggaaa gccgttgccg ccgtggatcaaacaacgcgt cgacaccttg 720 ttactcaagc agttcacggg agatccggtg cggttcggatttccggcacc ggactacaag 780 atctacgaat cgcatccggt cgtgaactcg ttgatcctgcaccacatcgg gcacggtgac 840 gtgcacgtgc gcgccgacgt cgaccggttc gaggggaagacggtgcggtt tgtcgacgga 900 tcgtctgccg actacgacct cgttctctgc gccacggggtatcacctcga ctatcccttc 960 atcgcgcgcg aggacctgga ctggtcgggt gctgccccggacctgttcct caacgtcgcg 1020 agtcgccgcc acgacaatct ctttgttctc ggcatggtcgaagcatccgg tctcgggtgg 1080 cagggtcgtt accagcaggc cgagttggtg gccaaattgatcaccgcacg caccgaagcc 1140 cccgccgcgg cgcgcgaatt ctcggcagcg gcggccggccctcctcccga 
tctgtccggg 1200 ggatacaagt acctgaagct gggacgaatg gcctactacgtgaacaagga cgcctaccga 1260 tcggcgatca gacggcacat cggactgctc gatgccgctctgacgaaggg aggtcagtga 1320 22 439 PRT Rhodococcus erythropolis AN12 22Met Ser Thr Glu Gly Lys Tyr Ala Leu Ile Gly Ala Gly Pro Ser Gly 1 5 1015 Leu Ala Gly Ala Arg Asn Leu Asp Arg Ala Gly Ile Ala Phe Asp Gly 20 2530 Phe Glu Ser His Asp Asp Val Gly Gly Leu Trp Asp Ile Asp Asn Pro 35 4045 His Ser Thr Val Tyr Glu Ser Ala His Leu Ile Ser Ser Lys Gly Thr 50 5560 Thr Ala Phe Ala Glu Phe Pro Met Ala Asp Ser Val Ala Asp Tyr Pro 65 7075 80 Ser His Ile Glu Leu Ala Glu Tyr Phe Arg Asp Tyr Ala Asp Thr His 8590 95 Asp Leu Arg Arg His Phe Ala Phe Gly Thr Thr Val Ile Asp Val Leu100 105 110 Pro Val Asp Ser Leu Trp Gln Val Thr Thr Arg Ser Arg Ser GlyGlu 115 120 125 Thr Ser Val Ala Arg Tyr Arg Gly Val Ile Ile Ala Asn GlyThr Leu 130 135 140 Ser Lys Pro Asn Ile Pro Thr Phe Arg Gly Asp Phe ThrGly Thr Leu 145 150 155 160 Met His Thr Ser Glu Tyr Arg Ser Ala Glu IlePhe Arg Gly Lys Arg 165 170 175 Val Leu Val Ile Gly Ala Gly Asn Ser GlyCys Asp Ile Ala Val Asp 180 185 190 Ala Val His Gln Ala Glu Cys Val AspLeu Ser Val Arg Arg Gly Tyr 195 200 205 Tyr Phe Val Pro Lys Tyr Leu PheGly Arg Pro Ser Asp Thr Leu Asn 210 215 220 Gln Gly Lys Pro Leu Pro ProTrp Ile Lys Gln Arg Val Asp Thr Leu 225 230 235 240 Leu Leu Lys Gln PheThr Gly Asp Pro Val Arg Phe Gly Phe Pro Ala 245 250 255 Pro Asp Tyr LysIle Tyr Glu Ser His Pro Val Val Asn Ser Leu Ile 260 265 270 Leu His HisIle Gly His Gly Asp Val His Val Arg Ala Asp Val Asp 275 280 285 Arg PheGlu Gly Lys Thr Val Arg Phe Val Asp Gly Ser Ser Ala Asp 290 295 300 TyrAsp Leu Val Leu Cys Ala Thr Gly Tyr His Leu Asp Tyr Pro Phe 305 310 315320 Ile Ala Arg Glu Asp Leu Asp Trp Ser Gly Ala Ala Pro Asp Leu Phe 325330 335 Leu Asn Val Ala Ser Arg Arg His Asp Asn Leu Phe Val Leu Gly Met340 345 350 Val Glu Ala Ser Gly Leu Gly Trp Gln Gly Arg Tyr Gln Gln AlaGlu 355 360 365 Leu Val Ala Lys Leu Ile Thr Ala Arg Thr Glu Ala Pro AlaAla Ala 370 375 380 Arg 
Glu Phe Ser Ala Ala Ala Ala Gly Pro Pro Pro AspLeu Ser Gly 385 390 395 400 Gly Tyr Lys Tyr Leu Lys Leu Gly Arg Met AlaTyr Tyr Val Asn Lys 405 410 415 Asp Ala Tyr Arg Ser Ala Ile Arg Arg HisIle Gly Leu Leu Asp Ala 420 425 430 Ala Leu Thr Lys Gly Gly Gln 435 231557 DNA Rhodococcus erythropolis AN12 23 atggtcgaca tcgacccaacctcggggcca tcggccggtg acgaggaaac tcgaactcgc 60 cgaacacgag tcgtcgtcatcggagccggt ttcggcggca tcggaacggc tgtccgcttg 120 aagcagtccg ggatcgacgacttcgtcgtt ctggaacgtg ccgcggagcc cggggggacc 180 tggcaggtca atacctaccccggtgcacag tgcgacatcc cgtcgattct gtactcgttc 240 tcgtttgcgc ccaatccgaactggacgcgg ctgtatcccc tgcagcccga gatctacgac 300 tatctccggg attgcgtccatcgcttcgga ctggccggtc atttccactg caaccaggac 360 gtgacagaag cttcgtgggacgagcaagcc cagatctggc gggtacacac tgcggaaacc 420 gtctgggagg cacagttcctggtcgcggcc accggcccgt tcagtgcccc cgccacaccc 480 gaccttcccg ggctcgaatcgtttcgtggt cagatgttcc acaccgcgga ctggaaccac 540 gaccacgacc ttcgcggtgagcggatagcc gtggtcggca ccggcgcctc tgcggtgcag 600 atcatcccca gactgcaaccgctcgcggac acgttgaccg tgttccagcg gacaccgacg 660 tggatcctgc cgcatccggatcagccgatg accggctggc caagcgctct cttcgagcgg 720 gtgccgctca cccaacgactggcacgcaag ggactcgacc tgcttcaaga agccctggta 780 cccggattcg tgtacaagccgtcactgctc aaagggctgg ccgcactcgg ccgagcacac 840 cttcgccggc aggtgcgggacccggagctt cgcgcaaagc tgctccccca ctacgcattc 900 ggatgcaagc gtccgacgttctcgaacacc tactatcccg cgctggcgtc acccaatgtg 960 gaggtggtga cggacggaatcgtcgaggtg caggagcgcg gagttctcac cgcggacggc 1020 gccttccggg aagtcgacaccatagtcatg ggaaccggct ttcggatggg agacaacccg 1080 tcgttcgaca ccatccgaggccaggacggc cgcagcctcg cacagacgtg gaacggcagt 1140 gccgaggcct tcctcggcaccactatcagc ggttttccga acttcttcat gatcctcggc 1200 cccaattccg tggtctacacctcacaggtc gtcacgatcg aagcccaggt cgagtacatc 1260 gtgagctgca ttcttcaaatggacgagcgc ggcatcggca gcatcgacgt ccgcgcagac 1320 gtgcaacgcg agttcgtacgcgcgacagac cgccgactcg ccaccagcgt gtggaacgcc 1380 ggcgggtgca gtagttactacctcgtcgac ggcggtcgca actacacctt ctatcccgga 1440 ttcaaccgat cattccgggccaggaccaaa 
cgagccgacc tcgctcacta cgcgcaggta 1500 caacccgtct cgtccgcagcactcaccact gctcgagaaa ccgtgaggag ccgataa 1557 24 518 PRT Rhodococcuserythropolis AN12 24 Met Val Asp Ile Asp Pro Thr Ser Gly Pro Ser Ala GlyAsp Glu Glu 1 5 10 15 Thr Arg Thr Arg Arg Thr Arg Val Val Val Ile GlyAla Gly Phe Gly 20 25 30 Gly Ile Gly Thr Ala Val Arg Leu Lys Gln Ser GlyIle Asp Asp Phe 35 40 45 Val Val Leu Glu Arg Ala Ala Glu Pro Gly Gly ThrTrp Gln Val Asn 50 55 60 Thr Tyr Pro Gly Ala Gln Cys Asp Ile Pro Ser IleLeu Tyr Ser Phe 65 70 75 80 Ser Phe Ala Pro Asn Pro Asn Trp Thr Arg LeuTyr Pro Leu Gln Pro 85 90 95 Glu Ile Tyr Asp Tyr Leu Arg Asp Cys Val HisArg Phe Gly Leu Ala 100 105 110 Gly His Phe His Cys Asn Gln Asp Val ThrGlu Ala Ser Trp Asp Glu 115 120 125 Gln Ala Gln Ile Trp Arg Val His ThrAla Glu Thr Val Trp Glu Ala 130 135 140 Gln Phe Leu Val Ala Ala Thr GlyPro Phe Ser Ala Pro Ala Thr Pro 145 150 155 160 Asp Leu Pro Gly Leu GluSer Phe Arg Gly Gln Met Phe His Thr Ala 165 170 175 Asp Trp Asn His AspHis Asp Leu Arg Gly Glu Arg Ile Ala Val Val 180 185 190 Gly Thr Gly AlaSer Ala Val Gln Ile Ile Pro Arg Leu Gln Pro Leu 195 200 205 Ala Asp ThrLeu Thr Val Phe Gln Arg Thr Pro Thr Trp Ile Leu Pro 210 215 220 His ProAsp Gln Pro Met Thr Gly Trp Pro Ser Ala Leu Phe Glu Arg 225 230 235 240Val Pro Leu Thr Gln Arg Leu Ala Arg Lys Gly Leu Asp Leu Leu Gln 245 250255 Glu Ala Leu Val Pro Gly Phe Val Tyr Lys Pro Ser Leu Leu Lys Gly 260265 270 Leu Ala Ala Leu Gly Arg Ala His Leu Arg Arg Gln Val Arg Asp Pro275 280 285 Glu Leu Arg Ala Lys Leu Leu Pro His Tyr Ala Phe Gly Cys LysArg 290 295 300 Pro Thr Phe Ser Asn Thr Tyr Tyr Pro Ala Leu Ala Ser ProAsn Val 305 310 315 320 Glu Val Val Thr Asp Gly Ile Val Glu Val Gln GluArg Gly Val Leu 325 330 335 Thr Ala Asp Gly Ala Phe Arg Glu Val Asp ThrIle Val Met Gly Thr 340 345 350 Gly Phe Arg Met Gly Asp Asn Pro Ser PheAsp Thr Ile Arg Gly Gln 355 360 365 Asp Gly Arg Ser Leu Ala Gln Thr TrpAsn Gly Ser Ala Glu Ala Phe 370 375 380 Leu Gly Thr Thr Ile Ser Gly PhePro Asn Phe Phe Met Ile 
Leu Gly 385 390 395 400 Pro Asn Ser Val Val TyrThr Ser Gln Val Val Thr Ile Glu Ala Gln 405 410 415 Val Glu Tyr Ile ValSer Cys Ile Leu Gln Met Asp Glu Arg Gly Ile 420 425 430 Gly Ser Ile AspVal Arg Ala Asp Val Gln Arg Glu Phe Val Arg Ala 435 440 445 Thr Asp ArgArg Leu Ala Thr Ser Val Trp Asn Ala Gly Gly Cys Ser 450 455 460 Ser TyrTyr Leu Val Asp Gly Gly Arg Asn Tyr Thr Phe Tyr Pro Gly 465 470 475 480Phe Asn Arg Ser Phe Arg Ala Arg Thr Lys Arg Ala Asp Leu Ala His 485 490495 Tyr Ala Gln Val Gln Pro Val Ser Ser Ala Ala Leu Thr Thr Ala Arg 500505 510 Glu Thr Val Arg Ser Arg 515 25 1626 DNA Rhodococcus erythropolisAN12 25 atgaccgatc ctgacttctc caccgcacca ctcgacgtcg tagtcatcggcgccggcgtc 60 gctggcatgt acgccatgca ccgacttcgc gagcaggggc tgcgtgtccacggcttcgag 120 gcgggctccg gagtgggcgg cacgtggtat ttcaaccgct accccggcgcacgctgcgac 180 gtcgagagtt tcgactactc ctactcgttc tccgaagagc tgcaacaggattgggactgg 240 agcgagaagt acgccgcgca accggagatc ctctcgtacc tcgatcacgtggctgatcgc 300 ttcgacctac gcactggctt caccttcgac acacgcgttc tgagcgcacagttcgacgag 360 ggtactgcca cgtggcgagt acagaccgac ggcggtcacg acgtcacctcacgcttcgtc 420 gtgtgcgcca cgggcagcct ctcgaccgca aacgttccga acattgcgggccgtgagacc 480 ttcggtggcg atgtgttcca caccggtttc tggccgcacg agggcgtcgacttcaccggc 540 aaacgcgtcg gcgtgatcgg caccggatcc tcgggcatcc agtccattccgctgatcgcc 600 gagcaggccg atcatctcta cgtgttccag cggtccgcga attacagtgtgccggcagga 660 aacacgcctc tcgatgacaa gcgccgcgcc gagatcaagg ccggctacgcagagcgtcga 720 gcgctgtcca agcgcagtgg cggtggatcg ccgttcgttt cggatcctcgcagcgccctc 780 gaagtctcgg aggccgagag aaacgcggca tacgaggagc ggtggaagctcggcggtgtc 840 ctgttcgcca agacattcgc agaccagacg agcaacatcg aggccaacgggacagcggca 900 gcgtttgccg aacgcaagat tcgctcggaa gtccaggatc aggcgatcgccgacctgctc 960 attccgaacg accaccccat cggaaccaag cggatagtca cggacacgaactactaccag 1020 agctacaacc gtgacaacgt cagcctggta gatctcaagt ccgcaccgatcgaggcgatc 1080 gacgaggctg gaatcaagac ggccgatgcg cactacgaac tggatgcgctggtgtttgcc 1140 accgggttcg acgcgatgac gggagcgctc gatcgcatcg 
agatccgcggccgcaatggc 1200 gagacgttgc gcgagaactg gcatgcgggt ccaaggacgt atctaggcctcggagtacac 1260 gggttcccca acctgttcat cgtcaccggg ccgggtagcc cgagtgtgctgtccaacatg 1320 attctcgctg ccgagcagca cgtggactgg atcgcgggcg cgatcaaccacctcgattcg 1380 gcgggcatcg acaccatcga accgagtgcc gaagccgtgg acaactggctcgacgaatgc 1440 tcacgccggg cgtcggcgac gctgtttcca tccgcgaact cctggtacatgggagccaac 1500 attccgggaa agccgaggat attcatgcca ttcatcggag gattcggtgtctactccgac 1560 atctgtgcag acgtggcagc agcgggatac cgaggcttcg aactgaacagtgcggtgcac 1620 gcatga 1626 26 541 PRT Rhodococcus erythropolis AN12 26Met Thr Asp Pro Asp Phe Ser Thr Ala Pro Leu Asp Val Val Val Ile 1 5 1015 Gly Ala Gly Val Ala Gly Met Tyr Ala Met His Arg Leu Arg Glu Gln 20 2530 Gly Leu Arg Val His Gly Phe Glu Ala Gly Ser Gly Val Gly Gly Thr 35 4045 Trp Tyr Phe Asn Arg Tyr Pro Gly Ala Arg Cys Asp Val Glu Ser Phe 50 5560 Asp Tyr Ser Tyr Ser Phe Ser Glu Glu Leu Gln Gln Asp Trp Asp Trp 65 7075 80 Ser Glu Lys Tyr Ala Ala Gln Pro Glu Ile Leu Ser Tyr Leu Asp His 8590 95 Val Ala Asp Arg Phe Asp Leu Arg Thr Gly Phe Thr Phe Asp Thr Arg100 105 110 Val Leu Ser Ala Gln Phe Asp Glu Gly Thr Ala Thr Trp Arg ValGln 115 120 125 Thr Asp Gly Gly His Asp Val Thr Ser Arg Phe Val Val CysAla Thr 130 135 140 Gly Ser Leu Ser Thr Ala Asn Val Pro Asn Ile Ala GlyArg Glu Thr 145 150 155 160 Phe Gly Gly Asp Val Phe His Thr Gly Phe TrpPro His Glu Gly Val 165 170 175 Asp Phe Thr Gly Lys Arg Val Gly Val IleGly Thr Gly Ser Ser Gly 180 185 190 Ile Gln Ser Ile Pro Leu Ile Ala GluGln Ala Asp His Leu Tyr Val 195 200 205 Phe Gln Arg Ser Ala Asn Tyr SerVal Pro Ala Gly Asn Thr Pro Leu 210 215 220 Asp Asp Lys Arg Arg Ala GluIle Lys Ala Gly Tyr Ala Glu Arg Arg 225 230 235 240 Ala Leu Ser Lys ArgSer Gly Gly Gly Ser Pro Phe Val Ser Asp Pro 245 250 255 Arg Ser Ala LeuGlu Val Ser Glu Ala Glu Arg Asn Ala Ala Tyr Glu 260 265 270 Glu Arg TrpLys Leu Gly Gly Val Leu Phe Ala Lys Thr Phe Ala Asp 275 280 285 Gln ThrSer Asn Ile Glu Ala Asn Gly Thr Ala Ala Ala Phe Ala Glu 290 295 300 ArgLys 
Ile Arg Ser Glu Val Gln Asp Gln Ala Ile Ala Asp Leu Leu 305 310 315320 Ile Pro Asn Asp His Pro Ile Gly Thr Lys Arg Ile Val Thr Asp Thr 325330 335 Asn Tyr Tyr Gln Ser Tyr Asn Arg Asp Asn Val Ser Leu Val Asp Leu340 345 350 Lys Ser Ala Pro Ile Glu Ala Ile Asp Glu Ala Gly Ile Lys ThrAla 355 360 365 Asp Ala His Tyr Glu Leu Asp Ala Leu Val Phe Ala Thr GlyPhe Asp 370 375 380 Ala Met Thr Gly Ala Leu Asp Arg Ile Glu Ile Arg GlyArg Asn Gly 385 390 395 400 Glu Thr Leu Arg Glu Asn Trp His Ala Gly ProArg Thr Tyr Leu Gly 405 410 415 Leu Gly Val His Gly Phe Pro Asn Leu PheIle Val Thr Gly Pro Gly 420 425 430 Ser Pro Ser Val Leu Ser Asn Met IleLeu Ala Ala Glu Gln His Val 435 440 445 Asp Trp Ile Ala Gly Ala Ile AsnHis Leu Asp Ser Ala Gly Ile Asp 450 455 460 Thr Ile Glu Pro Ser Ala GluAla Val Asp Asn Trp Leu Asp Glu Cys 465 470 475 480 Ser Arg Arg Ala SerAla Thr Leu Phe Pro Ser Ala Asn Ser Trp Tyr 485 490 495 Met Gly Ala AsnIle Pro Gly Lys Pro Arg Ile Phe Met Pro Phe Ile 500 505 510 Gly Gly PheGly Val Tyr Ser Asp Ile Cys Ala Asp Val Ala Ala Ala 515 520 525 Gly TyrArg Gly Phe Glu Leu Asn Ser Ala Val His Ala 530 535 540 27 1389 DNARhodococcus erythropolis AN12 27 atgagcccct cccccttgcc gagcgtctgcatcatcggcg ccgggcctac cggaatcacc 60 acggccaagc gaatgaagga attcggaatacccttcgact gctacgaagc gtccgacgag 120 gtcggcggaa actggtacta caagaaccccaacggaatgt cggcctgcta ccagagcctg 180 catatcgaca cgtcgaagtg gcgcttggcattcgaggact tcccggtctc tgccgacctt 240 cccgatttcc cccaccattc cgaactcttccagtacttca aggactacgt cgagcatttc 300 ggcctgcgtg agtcgatcat cttcaacaccagtgttgttg ctgcagagcg tgatgcaaac 360 ggactgtgga ccgtcacgcg ctcggacggcgaagtccgta cctacgacgt cctgatggtc 420 tgcaatggtc accactggga tcccaatatcccggattacc cgggcgagtt cgacggcgtc 480 ctcatgcaca gccacagcta caacgacccgttcgatccga tcgacatgcg cggcaagaaa 540 gtagtcgtgg tcggaatggg gaactccggcttggacattg cttccgaact ggggcagaga 600 tacctcgccg acaagctcat cgtctcggcgcgccgcggcg tgtgggtgtt gccgaaatac 660 ctgggcggcg tgccgggaga caaactgatcaccccgccct ggatgcctcg ggggctgcgc 720 ctgttcctga 
gtcgtcgatt cctcggcaagaacctgggaa ccatggaggg ctacggacta 780 cccaagccag atcaccgccc cttcgaggcacatccgtcag ccagtggcga gttcttggga 840 cgtgccgggt ccggcgacat caccttcaagccggcgatca ccaaactcga cggaaagcag 900 gttcatttcg ccgacggcac cgccgaggacgtcgacgtgg tcgtctgcgc caccggctac 960 aacatcagct tccccttctt cgacgacccgaacctgctgc cggacaaaga caaccgattc 1020 ccactcttca aacgcatgat gaagcccggaatcgacaacc tcttcttcat gggactcgct 1080 cagcccatgc cgacgctcgt aaacttcgccgagcagcaga gcaagctcgt cgcggcctac 1140 ctcaccggta aataccagct gccgtccgcgaacgagatgc aggagatcac caaggccgac 1200 gaggcgtact tcctcgcccc ctattacaagtcaccgcgcc acaccattca gctcgagttc 1260 gacccgtacg tccgcaacat gaacaaggaaattgccaagg gcaccaagcg tgccgcggcc 1320 tcggggaaca aactacctgt tgcggcgcgtgcagcagcac acgaactcga gaaggcggat 1380 cgcgcatga 1389 28 462 PRTRhodococcus erythropolis AN12 28 Met Ser Pro Ser Pro Leu Pro Ser Val CysIle Ile Gly Ala Gly Pro 1 5 10 15 Thr Gly Ile Thr Thr Ala Lys Arg MetLys Glu Phe Gly Ile Pro Phe 20 25 30 Asp Cys Tyr Glu Ala Ser Asp Glu ValGly Gly Asn Trp Tyr Tyr Lys 35 40 45 Asn Pro Asn Gly Met Ser Ala Cys TyrGln Ser Leu His Ile Asp Thr 50 55 60 Ser Lys Trp Arg Leu Ala Phe Glu AspPhe Pro Val Ser Ala Asp Leu 65 70 75 80 Pro Asp Phe Pro His His Ser GluLeu Phe Gln Tyr Phe Lys Asp Tyr 85 90 95 Val Glu His Phe Gly Leu Arg GluSer Ile Ile Phe Asn Thr Ser Val 100 105 110 Val Ala Ala Glu Arg Asp AlaAsn Gly Leu Trp Thr Val Thr Arg Ser 115 120 125 Asp Gly Glu Val Arg ThrTyr Asp Val Leu Met Val Cys Asn Gly His 130 135 140 His Trp Asp Pro AsnIle Pro Asp Tyr Pro Gly Glu Phe Asp Gly Val 145 150 155 160 Leu Met HisSer His Ser Tyr Asn Asp Pro Phe Asp Pro Ile Asp Met 165 170 175 Arg GlyLys Lys Val Val Val Val Gly Met Gly Asn Ser Gly Leu Asp 180 185 190 IleAla Ser Glu Leu Gly Gln Arg Tyr Leu Ala Asp Lys Leu Ile Val 195 200 205Ser Ala Arg Arg Gly Val Trp Val Leu Pro Lys Tyr Leu Gly Gly Val 210 215220 Pro Gly Asp Lys Leu Ile Thr Pro Pro Trp Met Pro Arg Gly Leu Arg 225230 235 240 Leu Phe Leu Ser Arg Arg Phe Leu Gly Lys Asn Leu Gly Thr MetGlu 245 
250 255 Gly Tyr Gly Leu Pro Lys Pro Asp His Arg Pro Phe Glu AlaHis Pro 260 265 270 Ser Ala Ser Gly Glu Phe Leu Gly Arg Ala Gly Ser GlyAsp Ile Thr 275 280 285 Phe Lys Pro Ala Ile Thr Lys Leu Asp Gly Lys GlnVal His Phe Ala 290 295 300 Asp Gly Thr Ala Glu Asp Val Asp Val Val ValCys Ala Thr Gly Tyr 305 310 315 320 Asn Ile Ser Phe Pro Phe Phe Asp AspPro Asn Leu Leu Pro Asp Lys 325 330 335 Asp Asn Arg Phe Pro Leu Phe LysArg Met Met Lys Pro Gly Ile Asp 340 345 350 Asn Leu Phe Phe Met Gly LeuAla Gln Pro Met Pro Thr Leu Val Asn 355 360 365 Phe Ala Glu Gln Gln SerLys Leu Val Ala Ala Tyr Leu Thr Gly Lys 370 375 380 Tyr Gln Leu Pro SerAla Asn Glu Met Gln Glu Ile Thr Lys Ala Asp 385 390 395 400 Glu Ala TyrPhe Leu Ala Pro Tyr Tyr Lys Ser Pro Arg His Thr Ile 405 410 415 Gln LeuGlu Phe Asp Pro Tyr Val Arg Asn Met Asn Lys Glu Ile Ala 420 425 430 LysGly Thr Lys Arg Ala Ala Ala Ser Gly Asn Lys Leu Pro Val Ala 435 440 445Ala Arg Ala Ala Ala His Glu Leu Glu Lys Ala Asp Arg Ala 450 455 460 291572 DNA Rhodococcus erythropolis AN12 29 gtgaacaacg aatctgaccacttcgaggtc gtgatcatcg gcggtggaat ttccggaatc 60 ggcgcggcta tccacctgcagcgtctcgga atcgacaact tcgcactcct cgagaaggcc 120 gactccctcg gtggaacctggcgcgccaac acctatcccg ggtgcgcctg cgacgttcca 180 tccggtctgt actcgtactcctttgccgcc aatccggatt ggacgcgctt gttcgcggag 240 caaccggaga tccgcgaatacatcgagaac acggcgggca cgcacggagt cgacaaacac 300 gttcgcttcg gggtcgaaatgctctccgcg cgatgggatg cgtcgcaatc actgtggaag 360 ataacaactt ccagcggcgaactgactgct cgcttcgtga tagccgctgc cggcccatgg 420 aacgaacccc tgacaccggcgatccccgga ctggaagcgt tcgagggaga ggtgtttcat 480 tcctcgcagt ggaatcacgactacgacctg accggaaaac tcgtcgccgt cgtaggaacc 540 ggagcgtcgg cagtccagttcgttccgcgc atcgtctccc aggtctccgc ccttcacctc 600 taccagcgaa ccgctcaatgggttctcccc aaacccgatc actacgtacc gcggatcgaa 660 aggtccgtca tgcgattcgtgccgggagca cagaaagcct tgcgcagcat cgaatacgga 720 atcatggaag cgctcggattgggattccgt aatccatgga tcctgcgaat cgtgcagaaa 780 ctcgggtcag cccaattgcgcctacaggta cgcgatccga agctgcgcaa ggcattgact 840 
cccgactaca ccctcggttgcaagcgactg ctcatgtcga actcgtacta tccggccctc 900 ggcaaaccca acgtcagcgtccatgccaac gccgtcgagc agatccgcgg taacaccgtg 960 atcggcgccg acggagtggaggcggaggtg gacgccatca tcttcggaac gggcttccac 1020 atcctcgaca tgcccatcgcatccaaggta ttcgacggag aaggtcgatc actcgacgat 1080 cattggcagg gaagcccgcaggcgtacttc ggctccgccg tcagtggatt ccccaacgca 1140 ttcatcctgc tgggcccgagcctcggcacc gggcacacat cggcgttcat gatcttggaa 1200 gcccaactga actatgtggcgcaggcaatc ggccacgccc gtcgtcacgg ctggcagacc 1260 atcgacgtgc gagaggaagttcaggcagcc ttcaattctc aggttcagga ggcattgggg 1320 accacggtct acaacgccggtggttgcgaa agctatttct tcgacgtcaa cggccgcaac 1380 agtttcaact ggccgtggtcgtccggcgcc atgcgtcgac ggctacggga cttcgatccg 1440 tatgcctaca accacacgtcgaaccctgag tcagacaaca cgccccctga acccacgcca 1500 tccgaaccca cgccatctgaacccacgcca tccgagccca ccaccagtcc ggaaccggag 1560 tacaccgcat ga 1572 30523 PRT Rhodococcus erythropolis AN12 30 Val Asn Asn Glu Ser Asp His PheGlu Val Val Ile Ile Gly Gly Gly 1 5 10 15 Ile Ser Gly Ile Gly Ala AlaIle His Leu Gln Arg Leu Gly Ile Asp 20 25 30 Asn Phe Ala Leu Leu Glu LysAla Asp Ser Leu Gly Gly Thr Trp Arg 35 40 45 Ala Asn Thr Tyr Pro Gly CysAla Cys Asp Val Pro Ser Gly Leu Tyr 50 55 60 Ser Tyr Ser Phe Ala Ala AsnPro Asp Trp Thr Arg Leu Phe Ala Glu 65 70 75 80 Gln Pro Glu Ile Arg GluTyr Ile Glu Asn Thr Ala Gly Thr His Gly 85 90 95 Val Asp Lys His Val ArgPhe Gly Val Glu Met Leu Ser Ala Arg Trp 100 105 110 Asp Ala Ser Gln SerLeu Trp Lys Ile Thr Thr Ser Ser Gly Glu Leu 115 120 125 Thr Ala Arg PheVal Ile Ala Ala Ala Gly Pro Trp Asn Glu Pro Leu 130 135 140 Thr Pro AlaIle Pro Gly Leu Glu Ala Phe Glu Gly Glu Val Phe His 145 150 155 160 SerSer Gln Trp Asn His Asp Tyr Asp Leu Thr Gly Lys Leu Val Ala 165 170 175Val Val Gly Thr Gly Ala Ser Ala Val Gln Phe Val Pro Arg Ile Val 180 185190 Ser Gln Val Ser Ala Leu His Leu Tyr Gln Arg Thr Ala Gln Trp Val 195200 205 Leu Pro Lys Pro Asp His Tyr Val Pro Arg Ile Glu Arg Ser Val Met210 215 220 Arg Phe Val Pro Gly Ala Gln Lys Ala Leu Arg Ser Ile Glu 
TyrGly 225 230 235 240 Ile Met Glu Ala Leu Gly Leu Gly Phe Arg Asn Pro TrpIle Leu Arg 245 250 255 Ile Val Gln Lys Leu Gly Ser Ala Gln Leu Arg LeuGln Val Arg Asp 260 265 270 Pro Lys Leu Arg Lys Ala Leu Thr Pro Asp TyrThr Leu Gly Cys Lys 275 280 285 Arg Leu Leu Met Ser Asn Ser Tyr Tyr ProAla Leu Gly Lys Pro Asn 290 295 300 Val Ser Val His Ala Asn Ala Val GluGln Ile Arg Gly Asn Thr Val 305 310 315 320 Ile Gly Ala Asp Gly Val GluAla Glu Val Asp Ala Ile Ile Phe Gly 325 330 335 Thr Gly Phe His Ile LeuAsp Met Pro Ile Ala Ser Lys Val Phe Asp 340 345 350 Gly Glu Gly Arg SerLeu Asp Asp His Trp Gln Gly Ser Pro Gln Ala 355 360 365 Tyr Phe Gly SerAla Val Ser Gly Phe Pro Asn Ala Phe Ile Leu Leu 370 375 380 Gly Pro SerLeu Gly Thr Gly His Thr Ser Ala Phe Met Ile Leu Glu 385 390 395 400 AlaGln Leu Asn Tyr Val Ala Gln Ala Ile Gly His Ala Arg Arg His 405 410 415Gly Trp Gln Thr Ile Asp Val Arg Glu Glu Val Gln Ala Ala Phe Asn 420 425430 Ser Gln Val Gln Glu Ala Leu Gly Thr Thr Val Tyr Asn Ala Gly Gly 435440 445 Cys Glu Ser Tyr Phe Phe Asp Val Asn Gly Arg Asn Ser Phe Asn Trp450 455 460 Pro Trp Ser Ser Gly Ala Met Arg Arg Arg Leu Arg Asp Phe AspPro 465 470 475 480 Tyr Ala Tyr Asn His Thr Ser Asn Pro Glu Ser Asp AsnThr Pro Pro 485 490 495 Glu Pro Thr Pro Ser Glu Pro Thr Pro Ser Glu ProThr Pro Ser Glu 500 505 510 Pro Thr Thr Ser Pro Glu Pro Glu Tyr Thr Ala515 520 31 1482 DNA Rhodococcus erythropolis AN12 31 atgagcaccgaacacctcga tgtcctgatc gtcggcgccg gcttgtccgg catcggtgct 60 gcttatcgactccagaccga gctcccagga aagtcgtacg caatcctcga ggcccgagcg 120 aacagcggcggaacctggga cctcttcaag tatcccggca tccgatcgga ttccgacatg 180 ttcacgctcggctacccgtt tcgcccgtgg acagatgcca aagcaatcgc cgacggtgat 240 tcgatcctgcggtacgtgcg cgacaccgcg cgagagaacg ggatcgacaa gaagattcgg 300 tacaaccggaaggtgacggc cgcatcatgg tcgtcagcga cctcgacctg gacagtcacg 360 gtcacgaccggcgacgaaga cgaaacattg acctgtaact tcctctatct ctgcagcggg 420 tactacagctacgacggcgg atacaccccc gacttccccg gacgtgaatc gtttgccggt 480 gaggtagtgcacccccagtt ctggcccgaa gaactcgatt 
actccgacaa gaaggtcgtt 540 gtgatcggaagcggcgccac cgcagtcact ttggtcccca cgatgtcacg ggacgcaagc 600 cacgtcacgatgctccagcg atcaccgacg tacattctgg cgcttccgtc cagcgacaaa 660 ttatcggacaccattcgcgc ggtactgccg aatcaactcg cgcacagcat cgctcgatgg 720 aagagcgtcgtagtgaacct gagtttctac caactgtgcc gacgcagtcc ggcgcgtgca 780 aagaggatgctgaacctcgc gatcagtcgt caactcccga aagacatccc cctcgatcct 840 cacttcacaccctcctacga tccctgggac cagcgcttgt gcgtcgtacc cgacggcgat 900 ttgttcaaagccctccgatc cggcaaggcc tcgatcgaga ccgatcacat cgacaccttc 960 accgagaccgggatccttct cgcgtcaggt cgcgaactcg aagctgacat catcgtcact 1020 gcaacaggattgaagatgga ggcgtgcggc gggatgtcca tcgaagtgga cggcgaactc 1080 gtcaccctcggtgatcgtta cgcctacaag ggcatgatga tcagcgacgt accgaacttc 1140 gcgatgtgcgtcggctacac caacgcctcg tggactctgc gagcagatct cacgtcgatg 1200 tacgtgtgccgactgctgac ggagatggac aagcgcgact attcgaagtg cgtgccgcac 1260 gcgaccgaagaaatggacca gcggccgatc ctggatctgg cgtcggggta cgtcatgcgt 1320 gccgtggaacagttcccgaa gcagggatcg aagtcaccgt ggaacatgcg tcagaactac 1380 atccttgaccgtcttcactc cacgttcggg agcatcaacg accacatgac gttctcgaag 1440 gcaccagctcgacattcgac gccggtaccg agcaagagtt ga 1482 32 493 PRT Rhodococcuserythropolis AN12 32 Met Ser Thr Glu His Leu Asp Val Leu Ile Val Gly AlaGly Leu Ser 1 5 10 15 Gly Ile Gly Ala Ala Tyr Arg Leu Gln Thr Glu LeuPro Gly Lys Ser 20 25 30 Tyr Ala Ile Leu Glu Ala Arg Ala Asn Ser Gly GlyThr Trp Asp Leu 35 40 45 Phe Lys Tyr Pro Gly Ile Arg Ser Asp Ser Asp MetPhe Thr Leu Gly 50 55 60 Tyr Pro Phe Arg Pro Trp Thr Asp Ala Lys Ala IleAla Asp Gly Asp 65 70 75 80 Ser Ile Leu Arg Tyr Val Arg Asp Thr Ala ArgGlu Asn Gly Ile Asp 85 90 95 Lys Lys Ile Arg Tyr Asn Arg Lys Val Thr AlaAla Ser Trp Ser Ser 100 105 110 Ala Thr Ser Thr Trp Thr Val Thr Val ThrThr Gly Asp Glu Asp Glu 115 120 125 Thr Leu Thr Cys Asn Phe Leu Tyr LeuCys Ser Gly Tyr Tyr Ser Tyr 130 135 140 Asp Gly Gly Tyr Thr Pro Asp PhePro Gly Arg Glu Ser Phe Ala Gly 145 150 155 160 Glu Val Val His Pro GlnPhe Trp Pro Glu Glu Leu Asp Tyr Ser Asp 165 170 175 Lys Lys Val Val 
ValIle Gly Ser Gly Ala Thr Ala Val Thr Leu Val 180 185 190 Pro Thr Met SerArg Asp Ala Ser His Val Thr Met Leu Gln Arg Ser 195 200 205 Pro Thr TyrIle Leu Ala Leu Pro Ser Ser Asp Lys Leu Ser Asp Thr 210 215 220 Ile ArgAla Val Leu Pro Asn Gln Leu Ala His Ser Ile Ala Arg Trp 225 230 235 240Lys Ser Val Val Val Asn Leu Ser Phe Tyr Gln Leu Cys Arg Arg Ser 245 250255 Pro Ala Arg Ala Lys Arg Met Leu Asn Leu Ala Ile Ser Arg Gln Leu 260265 270 Pro Lys Asp Ile Pro Leu Asp Pro His Phe Thr Pro Ser Tyr Asp Pro275 280 285 Trp Asp Gln Arg Leu Cys Val Val Pro Asp Gly Asp Leu Phe LysAla 290 295 300 Leu Arg Ser Gly Lys Ala Ser Ile Glu Thr Asp His Ile AspThr Phe 305 310 315 320 Thr Glu Thr Gly Ile Leu Leu Ala Ser Gly Arg GluLeu Glu Ala Asp 325 330 335 Ile Ile Val Thr Ala Thr Gly Leu Lys Met GluAla Cys Gly Gly Met 340 345 350 Ser Ile Glu Val Asp Gly Glu Leu Val ThrLeu Gly Asp Arg Tyr Ala 355 360 365 Tyr Lys Gly Met Met Ile Ser Asp ValPro Asn Phe Ala Met Cys Val 370 375 380 Gly Tyr Thr Asn Ala Ser Trp ThrLeu Arg Ala Asp Leu Thr Ser Met 385 390 395 400 Tyr Val Cys Arg Leu LeuThr Glu Met Asp Lys Arg Asp Tyr Ser Lys 405 410 415 Cys Val Pro His AlaThr Glu Glu Met Asp Gln Arg Pro Ile Leu Asp 420 425 430 Leu Ala Ser GlyTyr Val Met Arg Ala Val Glu Gln Phe Pro Lys Gln 435 440 445 Gly Ser LysSer Pro Trp Asn Met Arg Gln Asn Tyr Ile Leu Asp Arg 450 455 460 Leu HisSer Thr Phe Gly Ser Ile Asn Asp His Met Thr Phe Ser Lys 465 470 475 480Ala Pro Ala Arg His Ser Thr Pro Val Pro Ser Lys Ser 485 490 33 1620 DNARhodococcus erythropolis AN12 33 atgacagacg aattcgacgt agtgatcgtgggtgcaggtc tcgcaggtat gcagatgctg 60 cacgaggttc gcatggtcgg cctcacggccaaagttttcg aggccggcgg aggtgcaggt 120 ggcacctggt attggaaccg ctacccgggtgctcggtgtg acgtggagag tttggagtac 180 tcctatcagt tctccgaggt gctccaacaggaatgggaat ggacccgccg gtacgcagat 240 caggccgaga tcatgcgcta catcagccacgtcgtcgaaa ccttcgacct ggcccgcgac 300 atcaggtttc atacccgggt cgaggcgatgacctacgagg agaccaccgc caggtggacg 360 gttcagacgg acagtgccgg cgaggttgtggccaaattcg tgattatggc 
caccgggtgt 420 ctgtcggagc cgaacgtgcc gtacataccgggtgtggaga cattcgcggg cgacgtgctg 480 cacaccgggc gctggccgca ggatcccgtcgacttcacag gcaagcgggt cggcgtgatc 540 ggaaccggat catctggcgt gcaagccatcccactcatcg cgcggcaagc ggccgagctc 600 gtagtctttc agcgcactcc tgcatacacgttgcccgctg tcgacgagcc gctcgacccg 660 gaattgcagg cggcgatcaa ggccgattacagggggttcc gtgcgcgaaa caacgaagtg 720 cccaccgcgg gactctcccg atttccgacgaatccgaact cggttttcct gttctcaacg 780 aaggagcggg atgccatcct cgaacacaattggaaccgag gcgggccgtt gatgctgcgc 840 gccttcggcg atctgctggt ggactcagccgctaacgagg tggtagccga gttcgtccgc 900 aacaagatcc gccagatcgt taccgaccccgaggtcgctg cgaagctcac accgacacac 960 gtgatcggat gcaaacgaat ctgtctcagcgacggctatt acgagaccta caaccgggtc 1020 aacgtgcgct tagtcgacat caaacgccacccaatcgagg agatcacgcc tactacagcc 1080 cggaccggcg aggactcgca tgacctggacatgctcgtgt tcgccactgg ctacgatgcc 1140 atcactggcg cactctcacg catcgacatccgcggccgcg cagggttgtc attgcaggaa 1200 gcatggtcgg acggaccgcg cacctatctcgggctcgggg tctccggctt cccaaatctg 1260 ttcatcatga ccggccccgg aagcccatcggtattgacca atgttcttgt cgccatacac 1320 caacatgcga catggatcgg cgaatgcctgaagcatatga ccgacaacga tattcggaca 1380 atggaagcca cgcccgaagc cgagcagaactggggggacc acgtgcgcga cctcgccgag 1440 cagaccctgc tctcatcgtg cgggtcctggtacctcggag caaacatccc cggtaagaga 1500 caagtattca tgccgctggt cgggtttccggactacgcca agaaatgcgc ggaaatcgca 1560 tccgccggct acccgggctt cgccttccagtacgaccccg tccctgtgaa ccagagctga 1620 34 539 PRT Rhodococcuserythropolis AN12 34 Met Thr Asp Glu Phe Asp Val Val Ile Val Gly Ala GlyLeu Ala Gly 1 5 10 15 Met Gln Met Leu His Glu Val Arg Met Val Gly LeuThr Ala Lys Val 20 25 30 Phe Glu Ala Gly Gly Gly Ala Gly Gly Thr Trp TyrTrp Asn Arg Tyr 35 40 45 Pro Gly Ala Arg Cys Asp Val Glu Ser Leu Glu TyrSer Tyr Gln Phe 50 55 60 Ser Glu Val Leu Gln Gln Glu Trp Glu Trp Thr ArgArg Tyr Ala Asp 65 70 75 80 Gln Ala Glu Ile Met Arg Tyr Ile Ser His ValVal Glu Thr Phe Asp 85 90 95 Leu Ala Arg Asp Ile Arg Phe His Thr Arg ValGlu Ala Met Thr Tyr 100 105 110 Glu Glu Thr Thr Ala Arg Trp Thr Val 
GlnThr Asp Ser Ala Gly Glu 115 120 125 Val Val Ala Lys Phe Val Ile Met AlaThr Gly Cys Leu Ser Glu Pro 130 135 140 Asn Val Pro Tyr Ile Pro Gly ValGlu Thr Phe Ala Gly Asp Val Leu 145 150 155 160 His Thr Gly Arg Trp ProGln Asp Pro Val Asp Phe Thr Gly Lys Arg 165 170 175 Val Gly Val Ile GlyThr Gly Ser Ser Gly Val Gln Ala Ile Pro Leu 180 185 190 Ile Ala Arg GlnAla Ala Glu Leu Val Val Phe Gln Arg Thr Pro Ala 195 200 205 Tyr Thr LeuPro Ala Val Asp Glu Pro Leu Asp Pro Glu Leu Gln Ala 210 215 220 Ala IleLys Ala Asp Tyr Arg Gly Phe Arg Ala Arg Asn Asn Glu Val 225 230 235 240Pro Thr Ala Gly Leu Ser Arg Phe Pro Thr Asn Pro Asn Ser Val Phe 245 250255 Leu Phe Ser Thr Lys Glu Arg Asp Ala Ile Leu Glu His Asn Trp Asn 260265 270 Arg Gly Gly Pro Leu Met Leu Arg Ala Phe Gly Asp Leu Leu Val Asp275 280 285 Ser Ala Ala Asn Glu Val Val Ala Glu Phe Val Arg Asn Lys IleArg 290 295 300 Gln Ile Val Thr Asp Pro Glu Val Ala Ala Lys Leu Thr ProThr His 305 310 315 320 Val Ile Gly Cys Lys Arg Ile Cys Leu Ser Asp GlyTyr Tyr Glu Thr 325 330 335 Tyr Asn Arg Val Asn Val Arg Leu Val Asp IleLys Arg His Pro Ile 340 345 350 Glu Glu Ile Thr Pro Thr Thr Ala Arg ThrGly Glu Asp Ser His Asp 355 360 365 Leu Asp Met Leu Val Phe Ala Thr GlyTyr Asp Ala Ile Thr Gly Ala 370 375 380 Leu Ser Arg Ile Asp Ile Arg GlyArg Ala Gly Leu Ser Leu Gln Glu 385 390 395 400 Ala Trp Ser Asp Gly ProArg Thr Tyr Leu Gly Leu Gly Val Ser Gly 405 410 415 Phe Pro Asn Leu PheIle Met Thr Gly Pro Gly Ser Pro Ser Val Leu 420 425 430 Thr Asn Val LeuVal Ala Ile His Gln His Ala Thr Trp Ile Gly Glu 435 440 445 Cys Leu LysHis Met Thr Asp Asn Asp Ile Arg Thr Met Glu Ala Thr 450 455 460 Pro GluAla Glu Gln Asn Trp Gly Asp His Val Arg Asp Leu Ala Glu 465 470 475 480Gln Thr Leu Leu Ser Ser Cys Gly Ser Trp Tyr Leu Gly Ala Asn Ile 485 490495 Pro Gly Lys Arg Gln Val Phe Met Pro Leu Val Gly Phe Pro Asp Tyr 500505 510 Ala Lys Lys Cys Ala Glu Ile Ala Ser Ala Gly Tyr Pro Gly Phe Ala515 520 525 Phe Gln Tyr Asp Pro Val Pro Val Asn Gln Ser 530 535 35 1950DNA 
Rhodococcus erythropolis AN12 35 atgactatcg tcactgacct ggaccgtgaccacctgcgtt cggcggtgtt acggggcaat 60 gttccgacca tgctcgccgt gttgctggagctgaccgccg atgagcggtg ggtggcaccc 120 cgctatcaac ccacgcgcag tcggggcatggatgacaatt ccacgggagg acttccggag 180 gaggttcagt ccgaaatccg gagcgcgttgatcgacgcag tggaacgctg gtggacgctg 240 gacgagccgt cccggcggac gctggacagctcggaagtag agcgaatcct caacttcacc 300 tgcagcgaga ccgtaccgcc ggacttcgcgccgatgatgg cggagatagt caatggtccg 360 cagatcaagc ctgccaccgc caagtgcgacgagcgactcc acgccatcgt gatcggcgcc 420 ggcatcgcgg ggatgctggc ctccgtcgagctcagccgcg ctgggatccc tcacgtgatc 480 ctggagaaga acgacgacgt cggcggatcatggtgggaga accgctatcc gggcgccgga 540 gttgatacac cgagccacct ttactcgatctcgtcgttcc ctcgtaactg gtcgacccac 600 ttcggcaagc gcgacgaggt tcagggatatctcgaggact ttgcggaggc caacgacatc 660 cggcgcaatg tccgcttccg tcatgaggtgacgcgcgccg agttcgagga gtcgaaacag 720 agttggcgtg tgtccgtcca gcgaccaggtgaggcgtcgg agaccctcga ggctcccatc 780 ctgatcagcg cggtcggtct gctcaatcgtccgaagatcc cgcatctacc gggaatcgag 840 accttccgtg gtcgcctctt ccactccgccgagtggccga gcgagctcga cgatcccgag 900 tcgctccgcg gaaagcgagt gggcatcgtcggtaccggag ccagtgctat gcagatcggc 960 ccggccatcg cggatcgtgt cggatcgctgacgatcttcc agcgctcacc acagtggatc 1020 gcaccgaacg acgactactt cacgaccatcgacgacggcg tccactggct gatggacaac 1080 atccccggct atcgcgagtg gtaccgggcgcgtctgtcgt ggatcttcaa cgacaaggtg 1140 tactcgtccc tccaggtcga ccccgactggccagagccga gcgcctcgat caatgcgacc 1200 aaccatggtc atcgcaagtt ctacgaacgctatctccgcg atcagctggg tgatcgaaca 1260 gatctgatcg aggcatctct tccggactatccgccctttg gtaagcgaat gctgctggac 1320 aatggctggt tcacgatgct tcgtaagcccgacgtcacac tggtgcccca cggagtcgac 1380 gccctgacac cttctggact cgtcgacacgaacggcgtcg agcaccagct ggacgtcatt 1440 gtcatggcga cgggtttcca cagtgtgcgcgttctttacc cgatggacat cgtcggtcga 1500 tccggccggt ccaccggaga aatctggggcgagcacgacg cgcgcgccta cctggggatc 1560 acagttcctg acttccccaa tttcttcgtcatgaccggac cgaacaccgg cctgggacat 1620 ggggggagct tcatcacgat cctggaatgtcaggtccgct acatcatgga tgccttgaag 1680 ttgatgcaat cggaaaacct 
cggcgcgatggagtgccggg ccgaggtcaa cgatcgatac 1740 aacgaggccg tcgaccgaca gcacgcacagatggtctgga cccatccggc aatggagaac 1800 tggtaccgaa acccggacgg tcgcgtcgtgtcggtccttc cgtggcggat caacgactac 1860 tgggccatga cctaccgagt cgacccgtcagattttcgta ccgagccggc acgctccgag 1920 tcggtcccga ctccgaccgc gcgagggtga1950 36 649 PRT Rhodococcus erythropolis AN12 36 Met Thr Ile Val Thr AspLeu Asp Arg Asp His Leu Arg Ser Ala Val 1 5 10 15 Leu Arg Gly Asn ValPro Thr Met Leu Ala Val Leu Leu Glu Leu Thr 20 25 30 Ala Asp Glu Arg TrpVal Ala Pro Arg Tyr Gln Pro Thr Arg Ser Arg 35 40 45 Gly Met Asp Asp AsnSer Thr Gly Gly Leu Pro Glu Glu Val Gln Ser 50 55 60 Glu Ile Arg Ser AlaLeu Ile Asp Ala Val Glu Arg Trp Trp Thr Leu 65 70 75 80 Asp Glu Pro SerArg Arg Thr Leu Asp Ser Ser Glu Val Glu Arg Ile 85 90 95 Leu Asn Phe ThrCys Ser Glu Thr Val Pro Pro Asp Phe Ala Pro Met 100 105 110 Met Ala GluIle Val Asn Gly Pro Gln Ile Lys Pro Ala Thr Ala Lys 115 120 125 Cys AspGlu Arg Leu His Ala Ile Val Ile Gly Ala Gly Ile Ala Gly 130 135 140 MetLeu Ala Ser Val Glu Leu Ser Arg Ala Gly Ile Pro His Val Ile 145 150 155160 Leu Glu Lys Asn Asp Asp Val Gly Gly Ser Trp Trp Glu Asn Arg Tyr 165170 175 Pro Gly Ala Gly Val Asp Thr Pro Ser His Leu Tyr Ser Ile Ser Ser180 185 190 Phe Pro Arg Asn Trp Ser Thr His Phe Gly Lys Arg Asp Glu ValGln 195 200 205 Gly Tyr Leu Glu Asp Phe Ala Glu Ala Asn Asp Ile Arg ArgAsn Val 210 215 220 Arg Phe Arg His Glu Val Thr Arg Ala Glu Phe Glu GluSer Lys Gln 225 230 235 240 Ser Trp Arg Val Ser Val Gln Arg Pro Gly GluAla Ser Glu Thr Leu 245 250 255 Glu Ala Pro Ile Leu Ile Ser Ala Val GlyLeu Leu Asn Arg Pro Lys 260 265 270 Ile Pro His Leu Pro Gly Ile Glu ThrPhe Arg Gly Arg Leu Phe His 275 280 285 Ser Ala Glu Trp Pro Ser Glu LeuAsp Asp Pro Glu Ser Leu Arg Gly 290 295 300 Lys Arg Val Gly Ile Val GlyThr Gly Ala Ser Ala Met Gln Ile Gly 305 310 315 320 Pro Ala Ile Ala AspArg Val Gly Ser Leu Thr Ile Phe Gln Arg Ser 325 330 335 Pro Gln Trp IleAla Pro Asn Asp Asp Tyr Phe Thr Thr Ile Asp Asp 340 345 350 Gly Val 
HisTrp Leu Met Asp Asn Ile Pro Gly Tyr Arg Glu Trp Tyr 355 360 365 Arg AlaArg Leu Ser Trp Ile Phe Asn Asp Lys Val Tyr Ser Ser Leu 370 375 380 GlnVal Asp Pro Asp Trp Pro Glu Pro Ser Ala Ser Ile Asn Ala Thr 385 390 395400 Asn His Gly His Arg Lys Phe Tyr Glu Arg Tyr Leu Arg Asp Gln Leu 405410 415 Gly Asp Arg Thr Asp Leu Ile Glu Ala Ser Leu Pro Asp Tyr Pro Pro420 425 430 Phe Gly Lys Arg Met Leu Leu Asp Asn Gly Trp Phe Thr Met LeuArg 435 440 445 Lys Pro Asp Val Thr Leu Val Pro His Gly Val Asp Ala LeuThr Pro 450 455 460 Ser Gly Leu Val Asp Thr Asn Gly Val Glu His Gln LeuAsp Val Ile 465 470 475 480 Val Met Ala Thr Gly Phe His Ser Val Arg ValLeu Tyr Pro Met Asp 485 490 495 Ile Val Gly Arg Ser Gly Arg Ser Thr GlyGlu Ile Trp Gly Glu His 500 505 510 Asp Ala Arg Ala Tyr Leu Gly Ile ThrVal Pro Asp Phe Pro Asn Phe 515 520 525 Phe Val Met Thr Gly Pro Asn ThrGly Leu Gly His Gly Gly Ser Phe 530 535 540 Ile Thr Ile Leu Glu Cys GlnVal Arg Tyr Ile Met Asp Ala Leu Lys 545 550 555 560 Leu Met Gln Ser GluAsn Leu Gly Ala Met Glu Cys Arg Ala Glu Val 565 570 575 Asn Asp Arg TyrAsn Glu Ala Val Asp Arg Gln His Ala Gln Met Val 580 585 590 Trp Thr HisPro Ala Met Glu Asn Trp Tyr Arg Asn Pro Asp Gly Arg 595 600 605 Val ValSer Val Leu Pro Trp Arg Ile Asn Asp Tyr Trp Ala Met Thr 610 615 620 TyrArg Val Asp Pro Ser Asp Phe Arg Thr Glu Pro Ala Arg Ser Glu 625 630 635640 Ser Val Pro Thr Pro Thr Ala Arg Gly 645 37 1485 DNA Rhodococcuserythropolis AN12 37 gtgaagcttc ccgaacatgt cgaaacattg atcgtcggtgccggattcgc cggtatgggc 60 ttggcggcca gaatgcttcg tgacaaccga acggcggacgtcgtgttgat cgagcgcgga 120 gctgatatcg gtggcacctg gcgagacaac acctacccaggttgtgcctg tgacgtgccg 180 acggcgctgt actcgtattc ttttgcgccg agcgctgattggagtcatac ctttgctcgt 240 cagcccgaga tctacgacta tctgaagaaa gtggccgcagacaccggcat cggggatcgc 300 gtaatcctga actgcgaact cgaagccgct gtgtgggacgaggatgcggc gctgtggcgg 360 gtccggacat ccctggggtc gttgacagtc aaagcgctggtcgctgcgac cggggcgttg 420 tcgacaccca agatcccgga ttttcccggt ctcgaccaattctccggtac cactttccat 480 
tcggcgacgt ggaaccacga acacgaactg cgtggtgagcgcgtagccgt gatcggaacg 540 ggagcgtcgg cggttcagtt cgttcccgaa attgccgaccctgctgccca tgtcaccgtg 600 ttccagagaa ctccggcctg ggtgattccg cgaatggatcgcaccctgcc tgcggcgcag 660 aaggccgtct actcgcggat tcccgctacg cagaaagttgttcgcggagc ggtttacggt 720 tttcgcgagt tgctcggtgc cgcgatgtca catgcgacgtgggtcctgcc ggccttcgag 780 gcggccgcgc gcctccatct gcgcagacag gtgaaagatccggagttgcg ccggaaactg 840 actcccgatt tcacgatcgg ttgcaagcgc atgcttctgtccaacgactg gttgcgcacc 900 ctcgaccgcg cggacgtgag cctggtcgac agcgggctcgtctcggtcac cgagggcggg 960 gtggtcgacg ggcacggagt cgagcacaag gtcgacaccatcatcttcgc cacggggttc 1020 acgccgacgg aaccgcctgt ggcgcatctg atcaccggaaaacgtggcga aacgctggcc 1080 gcgcattgga acggtagccc caatgcctac aagggcactgcggtcagcgg gttcccgaat 1140 ctgttcctca tgtacggtcc gaacaccaac ctcggacacagttcgatcgt gtacatgctc 1200 gagtcccagg ccgagtacgt caacgacgcg ttgaacaccatgaaacgtga gcgactggac 1260 gctcttgatg tcaacgagtc ggtacaggtg cactacaacaagggaattca gcacgagttg 1320 cagcacacgg tgtggaacaa gggcggatgc tcgagttggtacatcgatcc ggaggggcgc 1380 aactcggtgc agtggccgac gttcacattc aaattccgttcgctgctgga gcatttcgat 1440 cgtgagaact actccgctcg caagatcgaa agcgtccaggcatga 1485 38 494 PRT Rhodococcus erythropolis AN12 38 Val Lys Leu ProGlu His Val Glu Thr Leu Ile Val Gly Ala Gly Phe 1 5 10 15 Ala Gly MetGly Leu Ala Ala Arg Met Leu Arg Asp Asn Arg Thr Ala 20 25 30 Asp Val ValLeu Ile Glu Arg Gly Ala Asp Ile Gly Gly Thr Trp Arg 35 40 45 Asp Asn ThrTyr Pro Gly Cys Ala Cys Asp Val Pro Thr Ala Leu Tyr 50 55 60 Ser Tyr SerPhe Ala Pro Ser Ala Asp Trp Ser His Thr Phe Ala Arg 65 70 75 80 Gln ProGlu Ile Tyr Asp Tyr Leu Lys Lys Val Ala Ala Asp Thr Gly 85 90 95 Ile GlyAsp Arg Val Ile Leu Asn Cys Glu Leu Glu Ala Ala Val Trp 100 105 110 AspGlu Asp Ala Ala Leu Trp Arg Val Arg Thr Ser Leu Gly Ser Leu 115 120 125Thr Val Lys Ala Leu Val Ala Ala Thr Gly Ala Leu Ser Thr Pro Lys 130 135140 Ile Pro Asp Phe Pro Gly Leu Asp Gln Phe Ser Gly Thr Thr Phe His 145150 155 160 Ser Ala Thr Trp Asn His Glu His Glu Leu Arg Gly 
Glu Arg ValAla 165 170 175 Val Ile Gly Thr Gly Ala Ser Ala Val Gln Phe Val Pro GluIle Ala 180 185 190 Asp Pro Ala Ala His Val Thr Val Phe Gln Arg Thr ProAla Trp Val 195 200 205 Ile Pro Arg Met Asp Arg Thr Leu Pro Ala Ala GlnLys Ala Val Tyr 210 215 220 Ser Arg Ile Pro Ala Thr Gln Lys Val Val ArgGly Ala Val Tyr Gly 225 230 235 240 Phe Arg Glu Leu Leu Gly Ala Ala MetSer His Ala Thr Trp Val Leu 245 250 255 Pro Ala Phe Glu Ala Ala Ala ArgLeu His Leu Arg Arg Gln Val Lys 260 265 270 Asp Pro Glu Leu Arg Arg LysLeu Thr Pro Asp Phe Thr Ile Gly Cys 275 280 285 Lys Arg Met Leu Leu SerAsn Asp Trp Leu Arg Thr Leu Asp Arg Ala 290 295 300 Asp Val Ser Leu ValAsp Ser Gly Leu Val Ser Val Thr Glu Gly Gly 305 310 315 320 Val Val AspGly His Gly Val Glu His Lys Val Asp Thr Ile Ile Phe 325 330 335 Ala ThrGly Phe Thr Pro Thr Glu Pro Pro Val Ala His Leu Ile Thr 340 345 350 GlyLys Arg Gly Glu Thr Leu Ala Ala His Trp Asn Gly Ser Pro Asn 355 360 365Ala Tyr Lys Gly Thr Ala Val Ser Gly Phe Pro Asn Leu Phe Leu Met 370 375380 Tyr Gly Pro Asn Thr Asn Leu Gly His Ser Ser Ile Val Tyr Met Leu 385390 395 400 Glu Ser Gln Ala Glu Tyr Val Asn Asp Ala Leu Asn Thr Met LysArg 405 410 415 Glu Arg Leu Asp Ala Leu Asp Val Asn Glu Ser Val Gln ValHis Tyr 420 425 430 Asn Lys Gly Ile Gln His Glu Leu Gln His Thr Val TrpAsn Lys Gly 435 440 445 Gly Cys Ser Ser Trp Tyr Ile Asp Pro Glu Gly ArgAsn Ser Val Gln 450 455 460 Trp Pro Thr Phe Thr Phe Lys Phe Arg Ser LeuLeu Glu His Phe Asp 465 470 475 480 Arg Glu Asn Tyr Ser Ala Arg Lys IleGlu Ser Val Gln Ala 485 490 39 1500 DNA Rhodococcus erythropolis AN12 39atgacacagc atgtcgacgt actgatcatc ggcgctggct tgtccggaat cggcgcggct 60tgccacctca ttcgtgagca gaccggaagc acttacgcga tcctcgagcg ccgcgagaac 120atcggtggca cctgggacct gttcaagtac ccgggcatcc gttcggactc cgacatgctc 180accttcggat tcggtttccg tccttggatc ggcaccaaag tgctcgcaga cggcgccagt 240atccgtgact acgtcgagga aaccgccaag gaatacggcg tcaccgacca catcaacttc 300ggccgcaagg tcgtggctat ggacttcgac cgtaccgccg cgcagtggtc cgtgaccgtc 360ctggtcgagg 
cgacagggga gaccgagacg tggaccgcga acgtcctcgt cggcgcctgt 420ggttactaca actacgacaa gggttaccgc cccgccttcc ccggtgagga cgacttccgc 480ggtcagatcg tgcacccgca gcactggccg gaggatctcg attacaccgg aaagaaggta 540gtggtcatcg gttccggcgc caccgcgatc acgctgatcc cgtcgatggc ccccaccgcc 600ggtcacgtca ccatgctgca gcgctcgccc acgtggatcc aggcgcttcc gtccgaggac 660cctgttgcca agggtctcaa gctcgcacgc gttcccgacc agattgctta caagattggt 720cgagcccgca atatcgcact gcaacgcgcc agctttcagc tttctcgcac caacccgaag 780ctggccaaga agctgttcct cgcccagatc cgcctgcagc tcggcaagaa cgtggacctg 840cgtcacttca ctcccagcta caacccgtgg gatcagcgcc tgtgcgtggt tcccaacggg 900gacctgttca aggtgctcaa gagcggcaag gccgacatcg tcaccgaccg tatcgccacg 960ttcaccgaga agggcatcgt gaccgagtcg ggccgcgaaa tcgaggccga cgtcatcgtc 1020acggcgaccg gcttgaacgt acagattctg ggcggcgcaa ccatgagcat cgacggcgag 1080ccggtcaagc tcaacgagac tgtggcctac aagagcgtgc tctactccga catcccgaac 1140ttcctgatga tcctcggcta caccaacgcg tcgtggacgc tcaaggctga cctggccgcg 1200tcctatctgt gtcgcgtgct caagatcatg cgcgatcgca gctacacgac tttcgaggtt 1260cacgccgaac ccgaggactt cgccgaagaa tctctcatgg gcggagccct gacctcgggc 1320tacatccagc gcggcgacgg agaaatgccg cgtcagggtg cccgcggcgc gtggaaagtg 1380gtcaacaatt actaccgcga ccgcaagctg atgcacgacg ccgagatcga agacggtgtg 1440ctgcagttca gcaaggtcga tattgctgtc gtgcctgata gcaaggtcgc cagcgcatag 150040 499 PRT Rhodococcus erythropolis AN12 40 Met Thr Gln His Val Asp ValLeu Ile Ile Gly Ala Gly Leu Ser Gly 1 5 10 15 Ile Gly Ala Ala Cys HisLeu Ile Arg Glu Gln Thr Gly Ser Thr Tyr 20 25 30 Ala Ile Leu Glu Arg ArgGlu Asn Ile Gly Gly Thr Trp Asp Leu Phe 35 40 45 Lys Tyr Pro Gly Ile ArgSer Asp Ser Asp Met Leu Thr Phe Gly Phe 50 55 60 Gly Phe Arg Pro Trp IleGly Thr Lys Val Leu Ala Asp Gly Ala Ser 65 70 75 80 Ile Arg Asp Tyr ValGlu Glu Thr Ala Lys Glu Tyr Gly Val Thr Asp 85 90 95 His Ile Asn Phe GlyArg Lys Val Val Ala Met Asp Phe Asp Arg Thr 100 105 110 Ala Ala Gln TrpSer Val Thr Val Leu Val Glu Ala Thr Gly Glu Thr 115 120 125 Glu Thr TrpThr Ala Asn Val Leu Val Gly Ala Cys Gly Tyr Tyr 
Asn 130 135 140 Tyr AspLys Gly Tyr Arg Pro Ala Phe Pro Gly Glu Asp Asp Phe Arg 145 150 155 160Gly Gln Ile Val His Pro Gln His Trp Pro Glu Asp Leu Asp Tyr Thr 165 170175 Gly Lys Lys Val Val Val Ile Gly Ser Gly Ala Thr Ala Ile Thr Leu 180185 190 Ile Pro Ser Met Ala Pro Thr Ala Gly His Val Thr Met Leu Gln Arg195 200 205 Ser Pro Thr Trp Ile Gln Ala Leu Pro Ser Glu Asp Pro Val AlaLys 210 215 220 Gly Leu Lys Leu Ala Arg Val Pro Asp Gln Ile Ala Tyr LysIle Gly 225 230 235 240 Arg Ala Arg Asn Ile Ala Leu Gln Arg Ala Ser PheGln Leu Ser Arg 245 250 255 Thr Asn Pro Lys Leu Ala Lys Lys Leu Phe LeuAla Gln Ile Arg Leu 260 265 270 Gln Leu Gly Lys Asn Val Asp Leu Arg HisPhe Thr Pro Ser Tyr Asn 275 280 285 Pro Trp Asp Gln Arg Leu Cys Val ValPro Asn Gly Asp Leu Phe Lys 290 295 300 Val Leu Lys Ser Gly Lys Ala AspIle Val Thr Asp Arg Ile Ala Thr 305 310 315 320 Phe Thr Glu Lys Gly IleVal Thr Glu Ser Gly Arg Glu Ile Glu Ala 325 330 335 Asp Val Ile Val ThrAla Thr Gly Leu Asn Val Gln Ile Leu Gly Gly 340 345 350 Ala Thr Met SerIle Asp Gly Glu Pro Val Lys Leu Asn Glu Thr Val 355 360 365 Ala Tyr LysSer Val Leu Tyr Ser Asp Ile Pro Asn Phe Leu Met Ile 370 375 380 Leu GlyTyr Thr Asn Ala Ser Trp Thr Leu Lys Ala Asp Leu Ala Ala 385 390 395 400Ser Tyr Leu Cys Arg Val Leu Lys Ile Met Arg Asp Arg Ser Tyr Thr 405 410415 Thr Phe Glu Val His Ala Glu Pro Glu Asp Phe Ala Glu Glu Ser Leu 420425 430 Met Gly Gly Ala Leu Thr Ser Gly Tyr Ile Gln Arg Gly Asp Gly Glu435 440 445 Met Pro Arg Gln Gly Ala Arg Gly Ala Trp Lys Val Val Asn AsnTyr 450 455 460 Tyr Arg Asp Arg Lys Leu Met His Asp Ala Glu Ile Glu AspGly Val 465 470 475 480 Leu Gln Phe Ser Lys Val Asp Ile Ala Val Val ProAsp Ser Lys Val 485 490 495 Ala Ser Ala 41 1482 DNA Rhodococcuserythropolis AN12 41 atgtcatcac gggtcaacga cggccacatc gcgatcatcggaaccgggtt ttccgggctg 60 tgcatggcga tcgaactgaa gaagaagggc atcgacgacttcgtcctgta cgaacgcgcc 120 gacgatgtcg gcggaacctg gcgcgacaac acatacccaggggcagcctg cgatgtgccc 180 agcgtgttgt attcctactc cttcgctcag aacccgaactggacccgtat 
cttcccgcca 240 tggtcggaac tgctcgacta tctcagatct gttgctgcgcagtatgattt gctgccgcac 300 atccgcttcg gtgtcgaggt ctccgaaatg cggttcgacgaggaccggct ccggtggaac 360 atccagttcg catccggcga atcagtgacg gcggccgttgtcgtcaacgg ctcagggggc 420 ttgagtaatc cgtacatccc gcagctaccc ggactggaatcattcgaggg tgccgcattc 480 cactccgcca agtggcgaca tgacctcgac atgtcgggaaggcgtgtcgc ggtgataggt 540 tccggcgcca gtgcgatcca gttcgtcccc gaaatcgccccgcacaccga gacccttcat 600 gtgtttcagc gatcacccaa ctgggtcatg ccacgtggtgatgccgcgct gtcgcccgcc 660 acccgcgaaa gattctcacg gcgtccttat cgtcaacggtggctgcgatg gcggacctac 720 tgggcattcg aaaagctcgc cagcgccttc ctcggaaatcgcaaactcgt cgaacagtac 780 cgatcccagg cgctcgccaa tcttcaacag caagtgccggattcggactt gaggcagaag 840 gtcaccccag attacgatcc tggctgtaaa cgtcgcttgatatccgacga ctggtacccc 900 gcgctgcaac gggaaaatgt gcacttgaac acctcgggggtttccgagat ccgcccgcat 960 tcgatcattg actcagaggg agcggaacac gaagtcgacaccctgatctt cgcgaccgga 1020 ttccaggcaa ccagcttcct ggcaccgatg aaagtattcggccgcgaagg agtcgaactc 1080 tccgacagtt ggcgcgaggg cgccgcaaca aagctcgggcttgcatccgc cgcgttcccg 1140 aacctgtggt tcctcaacgg cccgaatacc ggtctcggtcacaactcgat catcttcatg 1200 atcgaagcac aagccagata catcgcttcg gcagtgcagtacatgcgccg aaaaagtatc 1260 actgccctcg aactcgatcg caccgtccag acaggcagctacgccgccac ccaagaacgc 1320 atgcgccgaa ctgtatgggc atcgggtggc tgcgacagctggtatcaatc cgctgacggt 1380 cgaatcgaca ccctgtggcc ggccagcaca atcgaatactggttgcgcac caggctattc 1440 cgcaagtccg acttccatgc actgacgaca ggcaaaggatga 1482 42 493 PRT Rhodococcus erythropolis AN12 42 Met Ser Ser Arg ValAsn Asp Gly His Ile Ala Ile Ile Gly Thr Gly 1 5 10 15 Phe Ser Gly LeuCys Met Ala Ile Glu Leu Lys Lys Lys Gly Ile Asp 20 25 30 Asp Phe Val LeuTyr Glu Arg Ala Asp Asp Val Gly Gly Thr Trp Arg 35 40 45 Asp Asn Thr TyrPro Gly Ala Ala Cys Asp Val Pro Ser Val Leu Tyr 50 55 60 Ser Tyr Ser PheAla Gln Asn Pro Asn Trp Thr Arg Ile Phe Pro Pro 65 70 75 80 Trp Ser GluLeu Leu Asp Tyr Leu Arg Ser Val Ala Ala Gln Tyr Asp 85 90 95 Leu Leu ProHis Ile Arg Phe Gly Val Glu Val Ser Glu Met Arg Phe 
100 105 110 Asp GluAsp Arg Leu Arg Trp Asn Ile Gln Phe Ala Ser Gly Glu Ser 115 120 125 ValThr Ala Ala Val Val Val Asn Gly Ser Gly Gly Leu Ser Asn Pro 130 135 140Tyr Ile Pro Gln Leu Pro Gly Leu Glu Ser Phe Glu Gly Ala Ala Phe 145 150155 160 His Ser Ala Lys Trp Arg His Asp Leu Asp Met Ser Gly Arg Arg Val165 170 175 Ala Val Ile Gly Ser Gly Ala Ser Ala Ile Gln Phe Val Pro GluIle 180 185 190 Ala Pro His Thr Glu Thr Leu His Val Phe Gln Arg Ser ProAsn Trp 195 200 205 Val Met Pro Arg Gly Asp Ala Ala Leu Ser Pro Ala ThrArg Glu Arg 210 215 220 Phe Ser Arg Arg Pro Tyr Arg Gln Arg Trp Leu ArgTrp Arg Thr Tyr 225 230 235 240 Trp Ala Phe Glu Lys Leu Ala Ser Ala PheLeu Gly Asn Arg Lys Leu 245 250 255 Val Glu Gln Tyr Arg Ser Gln Ala LeuAla Asn Leu Gln Gln Gln Val 260 265 270 Pro Asp Ser Asp Leu Arg Gln LysVal Thr Pro Asp Tyr Asp Pro Gly 275 280 285 Cys Lys Arg Arg Leu Ile SerAsp Asp Trp Tyr Pro Ala Leu Gln Arg 290 295 300 Glu Asn Val His Leu AsnThr Ser Gly Val Ser Glu Ile Arg Pro His 305 310 315 320 Ser Ile Ile AspSer Glu Gly Ala Glu His Glu Val Asp Thr Leu Ile 325 330 335 Phe Ala ThrGly Phe Gln Ala Thr Ser Phe Leu Ala Pro Met Lys Val 340 345 350 Phe GlyArg Glu Gly Val Glu Leu Ser Asp Ser Trp Arg Glu Gly Ala 355 360 365 AlaThr Lys Leu Gly Leu Ala Ser Ala Ala Phe Pro Asn Leu Trp Phe 370 375 380Leu Asn Gly Pro Asn Thr Gly Leu Gly His Asn Ser Ile Ile Phe Met 385 390395 400 Ile Glu Ala Gln Ala Arg Tyr Ile Ala Ser Ala Val Gln Tyr Met Arg405 410 415 Arg Lys Ser Ile Thr Ala Leu Glu Leu Asp Arg Thr Val Gln ThrGly 420 425 430 Ser Tyr Ala Ala Thr Gln Glu Arg Met Arg Arg Thr Val TrpAla Ser 435 440 445 Gly Gly Cys Asp Ser Trp Tyr Gln Ser Ala Asp Gly ArgIle Asp Thr 450 455 460 Leu Trp Pro Ala Ser Thr Ile Glu Tyr Trp Leu ArgThr Arg Leu Phe 465 470 475 480 Arg Lys Ser Asp Phe His Ala Leu Thr ThrGly Lys Gly 485 490 43 1626 DNA Rhodococcus erythropolis AN12 43atgactacac aaaaggccct gaccactgtc gatgccatcg tcatcggcgc cggattcggc 60gggatctacg ccgtccacaa actggccaac gagctcggcc tcacgacggt cggcttcgac 
120aaggcagacg gcccgggcgg cacgtggtac tggaaccgct acccgggtgc actgtccgac 180accgaaagcc acgtctaccg gttctcattc gaccgtgacc tgcttcagga cggtacctgg 240aagcacacct acaccactca acccgagatt ctcgaatacc ttgaggatgt cgtttcccgg 300ttcgacctac gccggcactt ccacttcggc actgccgtcg aatctgcggt gtatctcgaa 360gacgaacaac tgtgggaagt caccaccgac acaggcgaga tctaccgcgc tacctacgtc 420gtcaatgctg tcgggctcct ctccgccatc aatcgaccgg atctgcccgg tctcgagaca 480ttcgaaggcg agaccatcca caccgcagcg tggcccgagg gcaaggatct caccggccgc 540cgcgtcggcg tgatcggtac cggatctact gggcaacagg tcatcacggc cctggcgcca 600acggtcgaac acctcactgt attcgtgcga actccccagt actcggtgcc ggtcggcaag 660cgcgcggtga ccgacgagca gatcgacgca gtcaaagccg actacgagaa catctggact 720caggtcaaaa gatcctcggt ggcattcggc ttcgaggaat ctactgttcc ggccatgagc 780gtgtccgcgg aagaacgcct cagggtctac gaagaggcat gggagcaggg cggcggtttc 840cgattcatgt tcggaacctt cggtgacatc gctaccgacg aagaagccaa cgaaactgca 900gcatcgttca ttcgctcgaa gatcaccgcc atgatcgaag acccggagac tgcccgcaaa 960ctgacgccca ccggactatt cgcgagacga ccgttgtgcg acgacgggta cttccaggtc 1020ttcaaccgcc cgaacgtcga ggcggtcgcc atcaaggaaa accccattcg tgagatcaca 1080gccaagggcg tggtgaccga ggacggcgtc ctgcacaaat tggacgtcct ggtcctcgcc 1140accggcttcg acgccgtcga cgggaactac cgccgcatga ccatttccgg tcgcggtggc 1200ctgaacatca acgaccattg ggacggccaa cccaccagct acctggggat tgccaccgcg 1260aacttcccca actggttcat ggtgctcggc cccaacggac cgttcacgaa ccttcctcca 1320agcatcgaaa ctcaggtcga gtggatcagc gacaccatag gttacgtcga gcggacaggt 1380gtgcgggcga tcgaacccac accggaggcg gaatccgcat ggaccgcgac ctgcacggac 1440atcgcgaaca tgaccgtctt caccaaggtt gattcatgga tcttcggggc caatgttcca 1500ggaaagaagc ccagcgtgct gttctacctt ggcgggctcg gcaactaccg cgccgtcctg 1560gcagacgtca ccgagggggg ctatcagggc tttgctctga agacggccga caccgtcgac 1620gcctga 1626 44 541 PRT Rhodococcus erythropolis AN12 44 Met Thr Thr GlnLys Ala Leu Thr Thr Val Asp Ala Ile Val Ile Gly 1 5 10 15 Ala Gly PheGly Gly Ile Tyr Ala Val His Lys Leu Ala Asn Glu Leu 20 25 30 Gly Leu ThrThr Val Gly Phe Asp Lys Ala Asp Gly Pro Gly Gly 
Thr 35 40 45 Trp Tyr TrpAsn Arg Tyr Pro Gly Ala Leu Ser Asp Thr Glu Ser His 50 55 60 Val Tyr ArgPhe Ser Phe Asp Arg Asp Leu Leu Gln Asp Gly Thr Trp 65 70 75 80 Lys HisThr Tyr Thr Thr Gln Pro Glu Ile Leu Glu Tyr Leu Glu Asp 85 90 95 Val ValSer Arg Phe Asp Leu Arg Arg His Phe His Phe Gly Thr Ala 100 105 110 ValGlu Ser Ala Val Tyr Leu Glu Asp Glu Gln Leu Trp Glu Val Thr 115 120 125Thr Asp Thr Gly Glu Ile Tyr Arg Ala Thr Tyr Val Val Asn Ala Val 130 135140 Gly Leu Leu Ser Ala Ile Asn Arg Pro Asp Leu Pro Gly Leu Glu Thr 145150 155 160 Phe Glu Gly Glu Thr Ile His Thr Ala Ala Trp Pro Glu Gly LysAsp 165 170 175 Leu Thr Gly Arg Arg Val Gly Val Ile Gly Thr Gly Ser ThrGly Gln 180 185 190 Gln Val Ile Thr Ala Leu Ala Pro Thr Val Glu His LeuThr Val Phe 195 200 205 Val Arg Thr Pro Gln Tyr Ser Val Pro Val Gly LysArg Ala Val Thr 210 215 220 Asp Glu Gln Ile Asp Ala Val Lys Ala Asp TyrGlu Asn Ile Trp Thr 225 230 235 240 Gln Val Lys Arg Ser Ser Val Ala PheGly Phe Glu Glu Ser Thr Val 245 250 255 Pro Ala Met Ser Val Ser Ala GluGlu Arg Leu Arg Val Tyr Glu Glu 260 265 270 Ala Trp Glu Gln Gly Gly GlyPhe Arg Phe Met Phe Gly Thr Phe Gly 275 280 285 Asp Ile Ala Thr Asp GluGlu Ala Asn Glu Thr Ala Ala Ser Phe Ile 290 295 300 Arg Ser Lys Ile ThrAla Met Ile Glu Asp Pro Glu Thr Ala Arg Lys 305 310 315 320 Leu Thr ProThr Gly Leu Phe Ala Arg Arg Pro Leu Cys Asp Asp Gly 325 330 335 Tyr PheGln Val Phe Asn Arg Pro Asn Val Glu Ala Val Ala Ile Lys 340 345 350 GluAsn Pro Ile Arg Glu Ile Thr Ala Lys Gly Val Val Thr Glu Asp 355 360 365Gly Val Leu His Lys Leu Asp Val Leu Val Leu Ala Thr Gly Phe Asp 370 375380 Ala Val Asp Gly Asn Tyr Arg Arg Met Thr Ile Ser Gly Arg Gly Gly 385390 395 400 Leu Asn Ile Asn Asp His Trp Asp Gly Gln Pro Thr Ser Tyr LeuGly 405 410 415 Ile Ala Thr Ala Asn Phe Pro Asn Trp Phe Met Val Leu GlyPro Asn 420 425 430 Gly Pro Phe Thr Asn Leu Pro Pro Ser Ile Glu Thr GlnVal Glu Trp 435 440 445 Ile Ser Asp Thr Ile Gly Tyr Val Glu Arg Thr GlyVal Arg Ala Ile 450 455 460 Glu Pro Thr Pro Glu Ala Glu 
Ser Ala Trp ThrAla Thr Cys Thr Asp 465 470 475 480 Ile Ala Asn Met Thr Val Phe Thr LysVal Asp Ser Trp Ile Phe Gly 485 490 495 Ala Asn Val Pro Gly Lys Lys ProSer Val Leu Phe Tyr Leu Gly Gly 500 505 510 Leu Gly Asn Tyr Arg Ala ValLeu Ala Asp Val Thr Glu Gly Gly Tyr 515 520 525 Gln Gly Phe Ala Leu LysThr Ala Asp Thr Val Asp Ala 530 535 540 45 1638 DNA Rhodococcuserythropolis AN12 45 atgacaacta ccgaatccag aactcagacc gacaaggctggggccgtcac gctcgatgcg 60 ttgatcatcg gcgccggagt cgccggtttg tatcagctccacatgcttcg cgagcaggga 120 ctgaacgtcc gcgcctacga cgctgcggaa gacgtcggcggtacgtggta ctggaaccgt 180 tacccaggcg cacgattcga ctccgaagcc tacatctaccagtacctgtt ctccgaggac 240 ctgtacaaga actggagctg gagtcaacgc ttcccggcccagcccgaaat tgagcggtgg 300 atgcgctacg tcgccgacac cctggacctg cgtcgcagcattcagttttc cacaacaatc 360 accagcgccg agttcgacga ggtagctgag cgttggaccattcgcaccga ccgcggcgag 420 gaaatcagca cccgattctt catcacctgt tgcggaatgctgtcggcgcc gatggaagat 480 ttgttccccg gacaacagga cttccggggg cagatcttccacacctcgcg atggccgcac 540 ggagatgtag aactcaccgg taagcgtgtc ggtgtcgtcggcgtcggcgc cactggcatt 600 caggtaatcc agaccatcgc cgacgaggtt gatcaactgaaggtgttcgt gcggacaccc 660 cagtacgcct tgccgatgaa aaaccctcag tacgacagcgacgacgtcgc ggcctacaag 720 gaccgattcg aggagcttcg aaccacactg ccgcacaccttcacaggctt cgaatacgat 780 ttcgaatacg tgtgggccga cctagccccc gaacagcgccgcgaggtgct cgagaacatc 840 tacgagtacg gatcactcaa gctgtggctg tcgtcgttcgcggagatgtt cttcgatgag 900 caggtcagtg acgagatctc cgagttcgtt cgcgagaaaatgcgggcgcg gctcatcgat 960 ccggagctgt gcgacctgct gattcccact gactatggcttcggcacaca ccgtgtgccg 1020 ctcgaaacca actacctcga ggtgtaccac cgcccgaatgtgacggccat cggcgtcaag 1080 aacaacccga tcgcgcgaat cgtcccccaa ggcatcgagttgaccgacgg taccttccac 1140 gaactagacg tgatcatttt ggccactggg ttcgatgcaggcaccggcgc actgactcga 1200 atcgacatcc gcggccgcgg tggtcggtct ctgaaggaagactggggacg cgatattcgc 1260 acgacaatgg gcctgatggt gcacggttac ccgaacatgctgacgaccgc cgtgcccctg 1320 gcaccctccg cggcactgtg caacatgacc acgtgcttgcagcagcagac cgagtggatc 1380 agcgaagcaa ttcgctacat 
gcaagagcgc gatctgaccgtcatcgagcc taccaaggag 1440 gccgaggacg cgtgggtggc gcaccacgac gaaacagccgcagtgaatct gatctccaag 1500 acggattcct ggtacgtagg ttccaacgtt ccagggaagccgcgacgggt cctgtcctac 1560 acggggggag tcggcgcata ccgagaaaag gcgcaggaaatcgccgacgc cggatacaag 1620 ggcttcaatc tgcgctga 1638 46 545 PRTRhodococcus erythropolis AN12 46 Met Thr Thr Thr Glu Ser Arg Thr Gln ThrAsp Lys Ala Gly Ala Val 1 5 10 15 Thr Leu Asp Ala Leu Ile Ile Gly AlaGly Val Ala Gly Leu Tyr Gln 20 25 30 Leu His Met Leu Arg Glu Gln Gly LeuAsn Val Arg Ala Tyr Asp Ala 35 40 45 Ala Glu Asp Val Gly Gly Thr Trp TyrTrp Asn Arg Tyr Pro Gly Ala 50 55 60 Arg Phe Asp Ser Glu Ala Tyr Ile TyrGln Tyr Leu Phe Ser Glu Asp 65 70 75 80 Leu Tyr Lys Asn Trp Ser Trp SerGln Arg Phe Pro Ala Gln Pro Glu 85 90 95 Ile Glu Arg Trp Met Arg Tyr ValAla Asp Thr Leu Asp Leu Arg Arg 100 105 110 Ser Ile Gln Phe Ser Thr ThrIle Thr Ser Ala Glu Phe Asp Glu Val 115 120 125 Ala Glu Arg Trp Thr IleArg Thr Asp Arg Gly Glu Glu Ile Ser Thr 130 135 140 Arg Phe Phe Ile ThrCys Cys Gly Met Leu Ser Ala Pro Met Glu Asp 145 150 155 160 Leu Phe ProGly Gln Gln Asp Phe Arg Gly Gln Ile Phe His Thr Ser 165 170 175 Arg TrpPro His Gly Asp Val Glu Leu Thr Gly Lys Arg Val Gly Val 180 185 190 ValGly Val Gly Ala Thr Gly Ile Gln Val Ile Gln Thr Ile Ala Asp 195 200 205Glu Val Asp Gln Leu Lys Val Phe Val Arg Thr Pro Gln Tyr Ala Leu 210 215220 Pro Met Lys Asn Pro Gln Tyr Asp Ser Asp Asp Val Ala Ala Tyr Lys 225230 235 240 Asp Arg Phe Glu Glu Leu Arg Thr Thr Leu Pro His Thr Phe ThrGly 245 250 255 Phe Glu Tyr Asp Phe Glu Tyr Val Trp Ala Asp Leu Ala ProGlu Gln 260 265 270 Arg Arg Glu Val Leu Glu Asn Ile Tyr Glu Tyr Gly SerLeu Lys Leu 275 280 285 Trp Leu Ser Ser Phe Ala Glu Met Phe Phe Asp GluGln Val Ser Asp 290 295 300 Glu Ile Ser Glu Phe Val Arg Glu Lys Met ArgAla Arg Leu Ile Asp 305 310 315 320 Pro Glu Leu Cys Asp Leu Leu Ile ProThr Asp Tyr Gly Phe Gly Thr 325 330 335 His Arg Val Pro Leu Glu Thr AsnTyr Leu Glu Val Tyr His Arg Pro 340 345 350 Asn Val Thr Ala Ile Gly 
Val Lys Asn Asn Pro Ile Ala Arg Ile Val 355 360 365 Pro Gln Gly Ile Glu Leu Thr Asp Gly Thr Phe His Glu Leu Asp Val 370 375 380 Ile Ile Leu Ala Thr Gly Phe Asp Ala Gly Thr Gly Ala Leu Thr Arg 385 390 395 400 Ile Asp Ile Arg Gly Arg Gly Gly Arg Ser Leu Lys Glu Asp Trp Gly 405 410 415 Arg Asp Ile Arg Thr Thr Met Gly Leu Met Val His Gly Tyr Pro Asn 420 425 430 Met Leu Thr Thr Ala Val Pro Leu Ala Pro Ser Ala Ala Leu Cys Asn 435 440 445 Met Thr Thr Cys Leu Gln Gln Gln Thr Glu Trp Ile Ser Glu Ala Ile 450 455 460 Arg Tyr Met Gln Glu Arg Asp Leu Thr Val Ile Glu Pro Thr Lys Glu 465 470 475 480 Ala Glu Asp Ala Trp Val Ala His His Asp Glu Thr Ala Ala Val Asn 485 490 495 Leu Ile Ser Lys Thr Asp Ser Trp Tyr Val Gly Ser Asn Val Pro Gly 500 505 510 Lys Pro Arg Arg Val Leu Ser Tyr Thr Gly Gly Val Gly Ala Tyr Arg 515 520 525 Glu Lys Ala Gln Glu Ile Ala Asp Ala Gly Tyr Lys Gly Phe Asn Leu 530 535 540 Arg 545 47 540 PRT Artificial Sequence consensus sequence 47 Met Thr Ala Gln Glu Ser Leu Thr Val Val Asp Ala Val Val Ile Gly 1 5 10 15 Ala Gly Phe Gly Gly Ile Tyr Ala Val His Lys Leu Arg Glu Gln Gly 20 25 30 Leu Thr Val Val Gly Phe Asp Ala Ala Asp Gly Pro Gly Gly Thr Trp 35 40 45 Tyr Trp Asn Arg Tyr Pro Gly Ala Leu Ser Asp Thr Glu Ser His Val 50 55 60 Tyr Arg Phe Ser Phe Asp Glu Asp Leu Leu Gln Asp Trp Thr Trp Lys 65 70 75 80 Glu Thr Tyr Pro Thr Gln Pro Glu Ile Leu Glu Tyr Leu Glu Asp Val 85 90 95 Val Asp Arg Phe Asp Leu Arg Arg Asp Phe Arg Phe Gly Thr Glu Val 100 105 110 Thr Ser Ala Thr Tyr Leu Glu Asp Glu Asn Leu Trp Glu Val Thr Thr 115 120 125 Asp Gly Gly Glu Val Tyr Arg Ala Arg Phe Val Val Asn Ala Val Gly 130 135 140 Leu Leu Ser Ala Ile Asn Phe Pro Asn Ile Pro Gly Leu Asp Thr Phe 145 150 155 160 Glu Gly Glu Thr Ile His Thr Ala Ala Trp Pro Glu Gly Val Asp Leu 165 170 175 Thr Gly Lys Arg Val Gly Val Ile Gly Thr Gly Ser Thr Gly Ile Gln 180 185 190 Val Ile Thr Ala Leu Ala Pro Glu Val Glu His Leu Thr Val Phe Val 195 200 205 Arg Thr Pro Gln Tyr Ser Val Pro Val Gly Asn Arg Pro Val Thr Ala 210 215 220 Glu Gln Ile Asp 
Ala Ile Lys Ala Asp Tyr Asp Glu Ile Trp Ala Gln 225 230 235240 Val Lys Arg Ser Gly Val Ala Phe Gly Phe Glu Glu Ser Thr Val Pro 245250 255 Ala Met Ser Val Ser Glu Glu Glu Arg Asn Arg Val Phe Glu Glu Ala260 265 270 Trp Glu Glu Gly Gly Gly Phe Arg Phe Met Phe Gly Thr Phe GlyAsp 275 280 285 Ile Ala Thr Asp Glu Ala Ala Asn Glu Thr Ala Ala Ser PheIle Arg 290 295 300 Ser Lys Ile Arg Glu Ile Val Lys Asp Pro Glu Thr AlaArg Lys Leu 305 310 315 320 Thr Pro Thr Gly Leu Phe Ala Arg Arg Arg LeuCys Asp Asp Gly Tyr 325 330 335 Tyr Glu Val Tyr Asn Arg Pro Asn Val GluAla Val Asp Ile Lys Glu 340 345 350 Asn Pro Ile Arg Glu Ile Thr Ala LysGly Val Val Thr Glu Asp Gly 355 360 365 Val Leu His Glu Leu Asp Val LeuVal Phe Ala Thr Gly Phe Asp Ala 370 375 380 Val Asp Gly Asn Tyr Arg ArgIle Asp Ile Arg Gly Arg Gly Gly Leu 385 390 395 400 Ser Leu Asn Asp HisTrp Asp Gly Gln Pro Thr Ser Tyr Leu Gly Leu 405 410 415 Ser Thr Ala GlyPhe Pro Asn Trp Phe Met Val Leu Gly Pro Asn Gly 420 425 430 Pro Phe ThrAsn Leu Pro Pro Ser Ile Glu Thr Gln Val Glu Trp Ile 435 440 445 Ser AspThr Ile Ala Tyr Ala Glu Glu Asn Gly Ile Arg Ala Ile Glu 450 455 460 ProThr Pro Glu Ala Glu Asp Glu Trp Thr Ala Thr Cys Thr Asp Ile 465 470 475480 Ala Asn Ala Thr Leu Phe Thr Lys Ala Asp Ser Trp Ile Phe Gly Ala 485490 495 Asn Val Pro Gly Lys Lys Pro Ser Val Leu Phe Tyr Leu Gly Gly Leu500 505 510 Gly Asn Tyr Arg Ala Val Leu Ala Asp Val Ala Ala Ala Gly TyrArg 515 520 525 Gly Phe Ala Leu Lys Ser Ala Asp Ala Val Thr Ala 530 535540 48 497 PRT Artificial Sequence consensus sequence 48 Met Val Xaa IlePro Xaa Arg His Xaa Glu Val Val Ile Ile Gly Ala 1 5 10 15 Gly Phe AlaGly Ile Gly Ala Ala Val Glu Leu Lys Arg Xaa Gly Ile 20 25 30 Asp Asp PheVal Leu Leu Glu Arg Ala Asp Asp Val Gly Gly Thr Trp 35 40 45 Arg Asp AsnThr Tyr Pro Gly Ala Ala Cys Asp Val Pro Ser Xaa Leu 50 55 60 Tyr Ser TyrSer Phe Ala Pro Asn Pro Asn Trp Thr Arg Leu Phe Ala 65 70 75 80 Xaa GlnPro Glu Ile Tyr Asp Tyr Leu Glu Asp Val Ala Ala Xaa Xaa 85 90 95 Gly LeuXaa Xaa His Val Arg 
Phe Gly Val Glu Val Thr Glu Ala Arg 100 105 110 TrpAsp Glu Ser Ala Gln Leu Trp Arg Val Xaa Thr Ala Ser Gly Glu 115 120 125Leu Thr Ala Xaa Phe Leu Val Ala Ala Thr Gly Pro Leu Ser Xaa Pro 130 135140 Lys Ile Pro Asp Leu Pro Gly Leu Glu Ser Phe Glu Gly Xaa Xaa Phe 145150 155 160 His Ser Ala Xaa Trp Asn His Asp Leu Asp Leu Arg Gly Glu ArgVal 165 170 175 Ala Val Val Gly Thr Gly Ala Ser Ala Val Gln Phe Val ProGlu Ile 180 185 190 Ala Asp Xaa Ala Xaa Thr Leu Thr Val Phe Gln Arg ThrPro Gln Trp 195 200 205 Val Leu Pro Arg Pro Asp Xaa Thr Leu Pro Xaa AlaXaa Arg Ala Val 210 215 220 Phe Ser Arg Val Pro Gly Thr Gln Lys Trp LeuArg Xaa Arg Leu Tyr 225 230 235 240 Gly Ile Phe Glu Ala Leu Gly Ser GlyPhe Val Xaa Pro Xaa Trp Leu 245 250 255 Leu Pro Xaa Xaa Xaa Ala Leu AlaArg Ala His Leu Arg Arg Gln Val 260 265 270 Arg Asp Pro Glu Leu Arg XaaLys Leu Thr Pro Asp Tyr Thr Pro Gly 275 280 285 Cys Lys Arg Met Leu LeuSer Asn Asp Trp Tyr Pro Ala Leu Xaa Lys 290 295 300 Pro Asn Val Ser LeuVal Thr Ser Gly Val Val Glu Val Thr Glu Xaa 305 310 315 320 Gly Val ValAsp Ala Asp Gly Val Glu His Glu Val Asp Thr Ile Ile 325 330 335 Phe AlaThr Gly Phe His Xaa Thr Asp Xaa Pro Xaa Ala Met Lys Ile 340 345 350 PheGly Arg Glu Gly Arg Ser Leu Ala Asp His Trp Asn Gly Ser Ala 355 360 365Xaa Ala Tyr Leu Gly Thr Ala Val Ser Gly Phe Pro Asn Leu Phe Xaa 370 375380 Leu Leu Gly Pro Asn Thr Gly Leu Gly His Thr Ser Ile Val Xaa Ile 385390 395 400 Leu Glu Ala Gln Ala Glu Tyr Ile Ala Ser Ala Leu Xaa Xaa MetArg 405 410 415 Arg Glu Gly Leu Gly Ala Leu Asp Val Arg Ala Glu Val GlnXaa Xaa 420 425 430 Phe Asn Xaa Ala Val Gln Glu Arg Leu Ala Thr Thr ValTrp Asn Ala 435 440 445 Gly Gly Cys Ser Ser Trp Tyr Xaa Asp Pro Asp GlyArg Asn Ser Thr 450 455 460 Xaa Trp Pro Trp Ser Thr Xaa Xaa Phe Arg AlaArg Thr Arg Arg Phe 465 470 475 480 Asp Pro Ser Asp Tyr Xaa Pro Ser SerPro Thr Pro Glu Thr Xaa Xaa 485 490 495 Gly 49 471 PRT ArtificialSequence consensus sequence 49 Met Ser Thr Glu His Leu Asp Val Leu IleIle Gly Ala Gly Leu Ser 1 5 10 15 Gly 
Ile Gly Ala Ala Xaa Arg Leu XaaArg Glu Xaa Gly Ile Xaa Phe 20 25 30 Ala Ile Leu Glu Ala Arg Asp Asn ValGly Gly Thr Trp Asp Leu Phe 35 40 45 Asn Tyr Pro Gly Ile Arg Ser Asp SerAsp His Leu Thr Xaa Gly Lys 50 55 60 Gly Ala Phe Arg Pro Phe Pro Xaa AlaLys Xaa Leu Ala Asp Gly Pro 65 70 75 80 Ser His Glu Leu Xaa Xaa Tyr ValArg Asp Thr Ala Xaa Glu Xaa Gly 85 90 95 Leu Arg Xaa His Ile Xaa Phe GlyThr Lys Val Val Ala Ala Xaa Xaa 100 105 110 Xaa Ala Xaa Ser Leu Trp ThrVal Thr Val Xaa Xaa Xaa Gly Glu Thr 115 120 125 Glu Val Xaa Thr Tyr AsnVal Leu Xaa Xaa Ala Asn Gly Tyr Tyr Ser 130 135 140 Tyr Asp Lys Gly AsnIle Pro Asp Phe Pro Gly Glu Phe Xaa Gly Xaa 145 150 155 160 Leu Val HisPro Gln Xaa Tyr Pro Glu Xaa Leu Asp Tyr Arg Gly Lys 165 170 175 Lys ValVal Val Ile Gly Ser Gly Ala Ser Gly Xaa Thr Leu Ala Pro 180 185 190 XaaMet Xaa Xaa Xaa Ala Xaa His Val Thr Met Leu Gln Arg Ser Gly 195 200 205Thr Tyr Ile Ala Leu Pro Ser Asp Ala Val Val Pro Xaa Gln Leu Ala 210 215220 Gly Xaa Arg Xaa Xaa Xaa Xaa Xaa Leu Gln Xaa Xaa Gln Leu Arg Xaa 225230 235 240 Pro Pro Trp Xaa Ala Lys Arg Leu Xaa Leu Leu Leu Ile Arg ArgGln 245 250 255 Leu Gly Lys Asn Val Xaa Leu Xaa Gly Phe Pro Thr Pro SerTyr Xaa 260 265 270 Pro Trp Asp Gln His Leu Cys Val Val Pro Asn Gly AspLeu Leu Lys 275 280 285 Xaa Leu Gly Ser Gly Asp Ala Xaa Ile Xaa Thr AspIle Asp Thr Phe 290 295 300 Thr Gly Lys Gly Val Xaa Phe Ala Ser Gly ArgGlu Xaa Asp Ala Asp 305 310 315 320 Val Val Val Thr Ala Thr Gly Leu AsnXaa Xaa Xaa Gly Gly Pro Phe 325 330 335 Ile Xaa Xaa Asp Gly Leu Leu ValAsp Leu Xaa Xaa Arg Xaa Ala Leu 340 345 350 Phe Tyr Lys Xaa Xaa Xaa XaaSer Asp Asn Leu Asn Phe Leu Gly Xaa 355 360 365 Val Gly Tyr Thr Asn AlaSer Trp Thr Leu Arg Ala Asp Leu Ala Xaa 370 375 380 Leu Val Ala Cys ArgLeu Leu Xaa Xaa Met Xaa Xaa Arg Ser Ala Xaa 385 390 395 400 Xaa Xaa XaaXaa His Ala Xaa Ala Glu Xaa Xaa Xaa Xaa Leu Leu Ala 405 410 415 Ser GlyTyr Lys Xaa Arg Xaa Xaa Gly Xaa Met Pro Xaa Gln Gly Xaa 420 425 430 LysXaa Xaa Trp Xaa Xaa Xaa Xaa Asn Tyr Xaa Xaa Asp 
Arg Xaa Leu 435 440 445Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Phe Ser Lys Xaa 450 455460 Xaa Xaa Ala Xaa Xaa Xaa Xaa 465 470 50 19 DNA Artificial SequencePrimer HK12 50 gagtttgatc ctggctcag 19 51 18 DNA Artificial SequencePrimer 51 caggmgccgc ggtaatwc 18 52 18 DNA Artificial Sequence PrimerHK21 52 gctgcctccc gtaggagt 18 53 19 DNA Artificial Sequence Primer 53ctaccagggt aactaatcc 19 54 15 DNA Artificial Sequence Primer 54acgggcggtg tgtac 15 55 20 DNA Artificial Sequence Primer 55 cacgagctgacgacagccat 20 56 16 DNA Artificial Sequence Primer HK13 56 taccttgttacgactt 16 57 18 DNA Artificial Sequence Primer 57 gwattaccgc ggckgctg 1858 19 DNA Artificial Sequence Primer 58 ggattagata ccctggtag 19 59 20DNA Artificial Sequence Primer 59 atggctgtcg tcagctcgtg 20 60 16 DNAArtificial Sequence Primer HK15 60 gcccccgyca attcct 16 61 17 DNAArtificial Sequence Primer HK14 61 gtgccagcag ymgcggt 17 62 16 DNAArtificial Sequence Primer JCR15 62 gccagcagcc gcggta 16 63 17 DNAArtificial Sequence Primer 63 cggagcagat cgavvvv 17 64 17 DNA ArtificialSequence M13 Reverse Primer 64 caggaaacag ctatgac 17 65 16 DNAArtificial Sequence M13 (-20) Forward Primer 65 ctggccgtcg ttttac 16 6634 DNA Acinetobacter sp. NCIB 9871 66 gagtctgagc atatgtcaca aaaaatggattttg 34 67 39 DNA Acinetobacter sp. NCIB 9871 67 gagtctgagg gatccttaggcattggcagg ttgcttgat 39 68 25 DNA Brevibacterium sp. HCU 68 atgccaattacacaacaact tgacc 25 69 23 DNA Brevibacterium sp. HCU 69 ctatttcatacccgccgatt cac 23 70 22 DNA Brevibacterium sp. HCU 70 atgacgtcaaccatgcctgc ac 22 71 21 DNA Brevibacterium sp. HCU 71 cacttaagtcgcattcagcc c 21 72 21 DNA Acinetobacter sp. SE19 72 atggattttgatgctatcgt g 21 73 19 DNA Acinetobacter sp. SE19 73 ggcattggca ggttgcttg19 74 22 DNA Arthrobacter sp. BP2 74 atgactgcac agaacacttt cc 22 75 18DNA Arthrobacter sp. BP2 75 tcaaagccgc ggtatccg 18 76 23 DNA Rhodococcussp. phi1 76 atgactgcac agatctcacc cac 23 77 22 DNA Rhodococcus sp. phi177 tcaggcggtc accgggacag cg 22 78 23 DNA Rhodococcus sp. 
phi2 78atgaccgcac agaccatcca cac 23 79 20 DNA Rhodococcus sp. phi2 79tcagaccgtg accatctcgg 20 80 21 DNA Brachymonas sp. CHX 80 atgtcttcctcgccaagcag c 21 81 21 DNA Brachymonas sp. CHX 81 cagtggttgg aacgcaaagc c21 82 23 DNA Rhodococcus erythropolis AN12 82 atgagcacag agggcaagta cgc23 83 25 DNA Rhodococcus erythropolis AN12 83 tcagtccttg ttcacgtagtaggcc 25 84 23 DNA Rhodococcus erythropolis AN12 84 atggtcgacatcgacccaac ctc 23 85 24 DNA Rhodococcus erythropolis AN12 85 ttatcggctcctcacggttt ctcg 24 86 24 DNA Rhodococcus erythropolis AN12 86 atgaccgatcctgacttctc cacc 24 87 24 DNA Rhodococcus erythropolis AN12 87 tcatgcgtgcaccgcactgt tcag 24 88 23 DNA Rhodococcus erythropolis AN12 88 atgagcccctcccccttgcc gag 23 89 24 DNA Rhodococcus erythropolis AN12 89 tcatgcgcgatccgccttct cgag 24 90 24 DNA Rhodococcus erythropolis AN12 90 gtgaacaacgaatctgacca cttc 24 91 23 DNA Rhodococcus erythropolis AN12 91 tcatgcggtgtactccggtt ccg 23 92 22 DNA Rhodococcus erythropolis AN12 92 atgagcaccgaacacctcga tg 22 93 23 DNA Rhodococcus erythropolis AN12 93 tcaactcttgctcggtaccg gcg 23 94 26 DNA Rhodococcus erythropolis AN12 94 atgacagacgaattcgacgt agtgat 26 95 23 DNA Rhodococcus erythropolis AN12 95tcagctctgg ttcacaggga cgg 23 96 23 DNA Rhodococcus erythropolis AN12 96atggcggaga tagtcaatgg tcc 23 97 22 DNA Rhodococcus erythropolis AN12 97tcaccctcgc gcggtcggag tc 22 98 26 DNA Rhodococcus erythropolis AN12 98gtgaagcttc ccgaacatgt cgaaac 26 99 25 DNA Rhodococcus erythropolis AN1299 tcatgcctgg acgctttcga tcttg 25 100 25 DNA Rhodococcus erythropolisAN12 100 atgacacagc atgtcgacgt actga 25 101 24 DNA Rhodococcuserythropolis AN12 101 ctatgcgctg gcgaccttgc tatc 24 102 25 DNARhodococcus erythropolis AN12 102 atgtcatcac gggtcaacga cggcc 25 103 24DNA Rhodococcus erythropolis AN12 103 tcatcctttg cctgtcgtca gtgc 24 10424 DNA Rhodococcus erythropolis AN12 104 atgactacac aaaaggccct gacc 24105 22 DNA Rhodococcus erythropolis AN12 105 tcaggcgtcg acggtgtcgg cc 22106 25 DNA Rhodococcus erythropolis AN12 106 
atgacaacta ccgaatccag aactc25 107 26 DNA Rhodococcus erythropolis AN12 107 tcagcgcaga ttgaagcccttgtatc 26 108 20 DNA Artificial Sequence Primer A102FI for screeningArthrobacter sp. BP2 library 108 gcacacctac atcacccagc 20 109 17 DNAArtificial Sequence Primer CONR for screening Arthrobacter sp. BP2library 109 ccgcccaggt agaacag 17 110 24 DNA Artificial Sequence PrimerA228FI for screening Rhodococcus sp. phi2 library 110 ggatctcgatccggcggtag ttgc 24 111 23 DNA Artificial Sequence Primer A228RI forscreening Rhodococcus sp. phi2 library 111 gctgatgccg accggtctgt acg 23112 23 DNA Artificial Sequence Primer A2FI for screening Rhodococcus sp.phi1 library 112 ccacagttgt cgacgccgtt gtc 23 113 22 DNA ArtificialSequence Primer A34RI for screening Rhodococcus sp. phi1 library 113tcgaaacctc ggtagctgtc gg 22
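Several of the primers listed above (for example SEQ ID NOs:51, 57, 60, 61 and 63) contain degenerate positions written with IUPAC ambiguity codes such as m, w, k, y and v. By way of illustration only, the following minimal sketch enumerates the concrete oligonucleotides represented by one degenerate primer. The function name `expand_degenerate` and the lookup table `IUPAC` are hypothetical names introduced for this example and are not part of the sequence listing.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (lower case, as in the listing).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at",
    "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg",
    "n": "acgt",
}


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer.

    Spaces (the ten-base grouping used in the listing) are ignored.
    """
    letters = primer.replace(" ", "").lower()
    choices = [IUPAC[base] for base in letters]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # SEQ ID NO:51 as printed in the listing: two two-fold positions (m, w).
    for variant in expand_degenerate("caggmgccgc ggtaatwc"):
        print(variant)
```

The number of variants is the product of the choices at each position, so a primer with many three-fold or four-fold positions expands quickly; SEQ ID NO:63, with four v positions, represents 81 distinct sequences.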

What is claimed is:
 1. An isolated nucleic acid fragment selected from the group consisting of: (a) an isolated nucleic acid fragment encoding a Baeyer-Villiger monooxygenase polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NOs:8, 10, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and 46; (b) an isolated nucleic acid molecule encoding a Baeyer-Villiger monooxygenase polypeptide that hybridizes with (a) under the following hybridization conditions: 0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS; or an isolated nucleic acid fragment that is complementary to (a) or (b).
 2. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 542 amino acids that has at least 55% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:8 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 3. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 541 amino acids that has at least 53% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:10 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 4. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 439 amino acids that has at least 37% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:22 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 5. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 518 amino acids that has at least 44% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:24 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 6. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 541 amino acids that has at least 64% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:26 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 7. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 462 amino acids that has at least 65% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:28 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 8. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 523 amino acids that has at least 45% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:30 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 9. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 493 amino acids that has at least 55% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:32 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 10. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 539 amino acids that has at least 51% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:34 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 11. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 649 amino acids that has at least 39% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:36 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 12. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 494 amino acids that has at least 43% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:38 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 13. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 499 amino acids that has at least 53% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:40 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 14. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 493 amino acids that has at least 44% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:42 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 15. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 541 amino acids that has at least 54% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:44 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 16. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 545 amino acids that has at least 42% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:46 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 17. The isolated nucleic acid fragment of claim 1 selected from the group consisting of SEQ ID NOs:7, 9, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, and 45.
 18. An isolated nucleic acid fragment of claim 1 isolated from Rhodococcus.
 19. A polypeptide encoded by the isolated nucleic acid fragment of claim 1.
 20. The polypeptide of claim 19 selected from the group consisting of SEQ ID NOs:8, 10, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and 46.
 21. An isolated nucleic acid fragment selected from the group consisting of: (a) an isolated nucleic acid fragment encoding a Baeyer-Villiger monooxygenase polypeptide having an amino acid sequence as set forth in SEQ ID NO:12; (b) an isolated nucleic acid molecule encoding a Baeyer-Villiger monooxygenase polypeptide that hybridizes with (a) under the following hybridization conditions: 0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS; or an isolated nucleic acid fragment that is complementary to (a), or (b).
 22. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 532 amino acids that has at least 57% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:11 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 23. An isolated nucleic acid fragment of claim 21 isolated from Arthrobacter.
 24. A polypeptide encoded by the isolated nucleic acid fragment of claim 21.
 25. The polypeptide of claim 24 as set forth in SEQ ID NO:12.
 26. An isolated nucleic acid fragment selected from the group consisting of: (a) an isolated nucleic acid fragment encoding a Baeyer-Villiger monooxygenase polypeptide having an amino acid sequence as set forth in SEQ ID NO:18; (b) an isolated nucleic acid molecule encoding a Baeyer-Villiger monooxygenase polypeptide that hybridizes with (a) under the following hybridization conditions: 0.1×SSC, 0.1% SDS, 65° C. and washed with 2×SSC, 0.1% SDS followed by 0.1×SSC, 0.1% SDS; or an isolated nucleic acid fragment that is complementary to (a), or (b).
 27. An isolated nucleic acid molecule comprising a first nucleotide sequence encoding a polypeptide of at least 538 amino acids that has at least 57% identity based on the Smith-Waterman method of alignment when compared to a polypeptide having the sequence as set forth in SEQ ID NO:17 or a second nucleotide sequence comprising the complement of the first nucleotide sequence.
 28. An isolated nucleic acid fragment of claim 26 isolated from Acidovorax.
 29. A polypeptide encoded by the isolated nucleic acid fragment of claim 26.
 30. The polypeptide of claim 29 selected from the group consisting of SEQ ID NO:18.
 31. A chimeric gene comprising the isolated nucleic acid fragment of any one of claims 1, 19, 25, 30, or 35 operably linked to suitable regulatory sequences.
 32. A transformed host cell comprising a host cell and the chimeric gene of claim 31.
 33. The transformed host cell of claim 32 wherein the host cell is selected from the group consisting of bacteria, yeast, filamentous fungi, and green plants.
 34. The transformed host cell of claim 33 wherein the host cell is selected from the group consisting of proteobacteria and actinomycetes.
 35. The transformed host cell of claim 34 wherein the host cell is selected from the group consisting of Burkholderia, Alcaligenes, Pseudomonas, Sphingomonas, Pandoraea, Delftia and Comamonas.
 36. The transformed host cell of claim 33 wherein the host cell is selected from the group consisting of Rhodococcus, Acinetobacter, Mycobacteria, Nocardia, Arthrobacter, Brevibacterium, Acidovorax, Bacillus, Streptomyces, Escherichia, Salmonella, Pseudomonas, Aspergillus, Saccharomyces, Pichia, Candida, Corynebacterium, and Hansenula.
 37. The transformed host cell of claim 33 wherein the host cell is selected from the group consisting of soybean, rapeseed, sunflower, cotton, corn, tobacco, alfalfa, wheat, barley, oats, sorghum, rice, Arabidopsis, cruciferous vegetables, melons, carrots, celery, parsley, tomatoes, potatoes, strawberries, peanuts, grapes, grass seed crops, sugar beets, sugar cane, beans, peas, rye, flax, hardwood trees, softwood trees, and forage grasses.
 38. A method of obtaining a nucleic acid fragment encoding a Baeyer-Villiger monooxygenase polypeptide comprising: (a) probing a genomic library with the nucleic acid fragment of any one of claims 1, 21, or 26; (b) identifying a DNA clone that hybridizes with the nucleic acid fragment of any one of claims 1, 21, or 26; (c) sequencing the genomic fragment that comprises the clone identified in step (b); wherein the sequenced genomic fragment encodes a Baeyer-Villiger monooxygenase polypeptide.
 39. A method of obtaining a nucleic acid fragment encoding a Baeyer-Villiger monooxygenase polypeptide comprising: (a) synthesizing at least one oligonucleotide primer corresponding to a portion of the isolated nucleic acid sequence of any one of claims 1, 21, or 26; and (b) amplifying an insert present in a cloning vector using the oligonucleotide primer of step (a); wherein the amplified insert encodes a Baeyer-Villiger monooxygenase polypeptide.
 40. A method for the identification of a polypeptide having monooxygenase activity comprising: (a) obtaining the amino acid sequence of a polypeptide suspected of having monooxygenase activity; and (b) aligning the amino acid sequence of step (a) with the amino acid sequence of a Baeyer-Villiger monooxygenase consensus sequence selected from the group consisting of SEQ ID NO:47, SEQ ID NO:48 and SEQ ID NO:49; wherein, where at least 80% of the amino acid residues at positions p1-p74 of SEQ ID NO:47, or at least 80% of the amino acid residues at p1-p76 of SEQ ID NO:48 or at least 80% of the amino acid residues of p1-p41 of SEQ ID NO:49 are completely conserved, the polypeptide of (a) is identified as having monooxygenase activity.
 41. A method according to claim 40 wherein at least 100% of the amino acid residues at positions p1-p74 of SEQ ID NO:47, or at least 100% of the amino acid residues at p1-p76 of SEQ ID NO:48 or at least 100% of the amino acid residues of p1-p41 of SEQ ID NO:49 are completely conserved.
 42. A method for identifying a gene encoding a Baeyer-Villiger monooxygenase polypeptide comprising: (a) probing a genomic library with a nucleic acid fragment encoding a polypeptide wherein at least 80% of the amino acid residues at positions p1-p74 of SEQ ID NO:47, or at least 80% of the amino acid residues at p1-p76 of SEQ ID NO:48 or at least 80% of the amino acid residues of p1-p41 of SEQ ID NO:49 are completely conserved; (b) identifying a DNA clone that hybridizes with a nucleic acid fragment of step (a); (c) sequencing the genomic fragment that comprises the clone identified in step (b); wherein the sequenced genomic fragment encodes a Baeyer-Villiger monooxygenase polypeptide.
 43. A method according to claim 42 wherein at least 100% of the amino acid residues at positions p1-p74 of SEQ ID NO:47, or at least 100% of the amino acid residues at p1-p76 of SEQ ID NO:48 or at least 100% of the amino acid residues of p1-p41 of SEQ ID NO:49 are completely conserved.
 44. The product of either of claims 40 or 42.
 45. A method for the biotransformation of a ketone substrate to the corresponding ester, comprising: contacting a transformed host cell under suitable growth conditions with an effective amount of ketone substrate whereby the corresponding ester is produced, said transformed host cell comprising a nucleic acid fragment encoding an isolated nucleic acid fragment of any of claims 1, 21, 26 or 44; under the control of suitable regulatory sequences.
 46. The method of claim 45 wherein the ketone substrate is selected from the group consisting of cyclic ketones and ketoterpenes having the general formula:

wherein R and R₁ are independently selected from substituted or unsubstituted phenyl, substituted or unsubstituted alkyl, or substituted or unsubstituted alkenyl or substituted or unsubstituted alkylidene.
 47. The method of claim 46 wherein the ketone substrate is selected from the group consisting of Norcamphor, Cyclobutanone, Cyclopentanone, 2-methyl-cyclopentanone, Cyclohexanone, 2-methyl-cyclohexanone, Cyclohex-2-ene-1-one, 1,2-cyclohexanedione, 1,3-cyclohexanedione, 1,4-cyclohexanedione, Cycloheptanone, Cyclooctanone, Cyclodecanone, Cycloundecanone, Cyclododecanone, Cyclotridecanone, Cyclopentadecanone, 2-tridecanone, dihexyl ketone, 2-phenyl-cyclohexanone, Oxindole, Levoglucosenone, dimethyl sulfoxide, dimethyl-2-piperidone, Phenylboronic acid, and beta-ionone.
 48. A method for the in vitro transformation of a ketone substrate to the corresponding ester, comprising: contacting a ketone substrate under suitable reaction conditions with an effective amount of a Baeyer-Villiger monooxygenase enzyme, the enzyme having an amino acid sequence selected from the group consisting of SEQ ID NOs:8, 10, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and 46.
 49. A method according to claim 48 wherein the ketone substrate is selected from the group consisting of cyclic ketones and ketoterpenes having the general formula:

wherein R and R₁ are independently selected from substituted or unsubstituted phenyl, substituted or unsubstituted alkyl, or substituted or unsubstituted alkenyl or substituted or unsubstituted alkylidene.
 50. A method according to claim 48 wherein the ketone substrate is selected from the group consisting of Norcamphor, Cyclobutanone, Cyclopentanone, 2-methyl-cyclopentanone, Cyclohexanone, 2-methyl-cyclohexanone, Cyclohex-2-ene-1-one, 1,2-cyclohexanedione, 1,3-cyclohexanedione, 1,4-cyclohexanedione, Cycloheptanone, Cyclooctanone, Cyclodecanone, Cycloundecanone, Cyclododecanone, Cyclotridecanone, Cyclopentadecanone, 2-tridecanone, dihexyl ketone, 2-phenyl-cyclohexanone, Oxindole, Levoglucosenone, dimethyl sulfoxide, dimethyl-2-piperidone, Phenylboronic acid, and beta-ionone.
 51. A mutated microbial gene encoding a protein having an altered biological activity produced by a method comprising the steps of (i) digesting a mixture of nucleotide sequences with restriction endonucleases wherein said mixture comprises: a) a native microbial gene selected from the group consisting of SEQ ID NOs:7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, and 45; b) a first population of nucleotide fragments which will hybridize to said native microbial sequence; c) a second population of nucleotide fragments which will not hybridize to said native microbial sequence; wherein a mixture of restriction fragments are produced; (ii) denaturing said mixture of restriction fragments; (iii) incubating the denatured said mixture of restriction fragments of step (ii) with a polymerase; (iv) repeating steps (ii) and (iii) wherein a mutated microbial gene is produced encoding a protein having an altered biological activity.
 52. An Acidovorax sp. comprising the 16S rDNA sequence as set forth in SEQ ID NO:5.
 53. An Arthrobacter sp. comprising the 16S rDNA sequence as set forth in SEQ ID NO:1.
 54. A Rhodococcus sp. comprising the 16S rDNA sequence as set forth in SEQ ID NO:6.
 55. An isolated nucleic acid useful for the identification of a BV monooxygenase selected from the group consisting of SEQ ID NOs:70-113.
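
Several of the claims above recite a minimum polypeptide length together with a minimum percent identity "based on the Smith-Waterman method of alignment". By way of illustration only, the following minimal sketch shows how such a comparison might be computed. The scoring values (match +2, mismatch -1, linear gap penalty -2), the convention of dividing identical residues by the number of aligned columns, and the names `smith_waterman_identity` and `satisfies_threshold` are all assumptions introduced for this example; the claims do not specify a substitution matrix or gap penalties, and a practical comparison would normally use an established alignment program.

```python
def smith_waterman_identity(a, b, match=2, mismatch=-1, gap=-2):
    """Percent identity over the best local alignment of sequences a and b.

    Scoring values are illustrative assumptions, not taken from the claims.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; local scores never drop below 0.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until the score reaches 0.
    i, j = best_pos
    identical = 0
    columns = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            if a[i - 1] == b[j - 1]:
                identical += 1
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1

    return 100.0 * identical / columns if columns else 0.0


def satisfies_threshold(candidate, reference, min_length, min_identity):
    """Hypothetical check: candidate length and identity both meet the minima."""
    identity = smith_waterman_identity(candidate, reference)
    return len(candidate) >= min_length and identity >= min_identity


if __name__ == "__main__":
    # Toy nine-residue peptides differing at one position.
    print(smith_waterman_identity("MTTTESRTQ", "MTTAESRTQ"))
```

Because the alignment is local, the identity reported here covers only the aligned region, which is one reason the claims pair each identity threshold with a minimum polypeptide length.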